§ 1. INTRODUCTION


The physical properties of diamonds have been investigated by scientists
for several centuries ; the diamond is the hardest known substance, it 
has superlative optical properties as a gem-stone and it retains its perfec- 
tion indefinitely ; it is thus a material of exceptional interest. Much 
information of historical value has been collected (Williams 1932) and 
it has long been known that the purest available diamonds consist almost 
entirely of carbon with barely detectable traces of other elements such as 
silicon and magnesium. Crystallographers soon established that diamond 
crystallizes in the cubic system but perhaps the greatest single step forward 
in the understanding of its structure was the demonstration over forty 
years ago by Bragg (see Bragg 1933), that, from x-ray diffraction data, 
the carbon atoms are arranged as shown in fig. 1.


§ 2. THEORY OF THE PROPERTIES OF PURE DIAMOND


The main x-ray diffraction pattern obtained for all diamonds is in- 
variably the same (Lonsdale 1947), indicating that the space lattice is a 
face-centred cube with two atoms, at 000 and ¼¼¼, associated with each
lattice point. The bonding is tetrahedral, each carbon atom having four 
nearest neighbours to which it is linked by straight bonds inclined at 
angles of 109° 28’ to each other. Since the atomic number Z is equal to 
6, two of the six electrons are in the K shell and take no part in chemical 
binding. The four remaining electrons have orbitals formed by superposition
of wave functions having symmetry 2p_x, 2p_y and 2p_z, and
the particular grouping of carbon atoms determines the way in which 
these four orbitals are formed. Considerations from wave mechanics 
show that the principle of the maximum overlapping of orbitals (Coulson 
1941) requires each of the four unpaired electrons to form a covalent 
bond with four nearest neighbours by the overlapping of tetrahedral 
orbitals, the carbon nucleus residing at the centroid of the regular 
tetrahedron. 
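The bond angle of 109° 28' quoted above follows directly from the geometry of a regular tetrahedron. A minimal sketch in Python, assuming the four bonds point from the centroid towards alternate corners of a cube (the direction vectors are a conventional choice for illustration, not taken from the text):

```python
import itertools
import math

# Bond directions from a carbon atom at the centroid of a regular tetrahedron
# to its four corners (alternate corners of a cube; an assumed, standard choice).
BOND_DIRECTIONS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def bond_angle_deg(u, v):
    """Angle in degrees between two bond directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


def to_degrees_minutes(angle_deg):
    """Split a decimal angle into whole degrees and rounded minutes of arc."""
    whole = int(angle_deg)
    minutes = round((angle_deg - whole) * 60)
    return whole, minutes


# All six pairs of bonds should give the same angle, arccos(-1/3).
angles = [bond_angle_deg(u, v)
          for u, v in itertools.combinations(BOND_DIRECTIONS, 2)]
print(to_degrees_minutes(angles[0]))  # (109, 28)
```

Every pair of bonds gives the same angle, which is the symmetry the later defect arguments rely on.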

Such a simple, symmetrical, covalent structure would indicate that 
diamond might prove exceptionally suitable as a prototype for theoretical 
study of the solid state. As for other solids its density can be calculated 
from a knowledge of the mass of the carbon atom and the lattice spacing 
given by x-ray diffraction data. More complex calculations can also be 
made to give the theoretical values of a great many of its physical proper- 
ties, mechanical, thermal, optical and electrical (Kittel 1953). Such 
calculations may be made on a classical basis on the assumption of certain 
laws of force between the atoms, but it is now clear that treatment based 
on quantum theory is essential (Peierls 1955). The present article is 
chiefly concerned with recent work on some optical and electrical properties 
of diamonds, and consequently physical properties other than these will 
be treated here only as additional evidence. The optical and electrical 
properties are determined by the transitions of bound or free electrons 
between the allowed energy levels in the diamond. The theoretical 
problem is therefore the calculation of these levels and of the probabilities
of transitions between them. The essential feature of an electron travelling
through a crystal structure is that of an electron moving in a region 
where the potential is varying in a periodic way (Mott and Gurney 1950).
The appropriate solutions of Schrödinger's equation show that there
are forbidden ranges of energy where solutions representing an electron 
moving through the crystal do not exist. For diamond the allowed bands 
separate into an (empty) conduction band and a (full) valence band 
with a forbidden energy gap between these two bands. From plausible 
values of the potential and the known structure of diamond, the energy 
gap E may be calculated to be (Herman 1952) nearly 6 ev, with all
available levels in the valence band filled and the conduction band
completely empty when the crystal is in its lowest state. Theory therefore
predicts that diamond will be an electrical insulator since available
thermal energy for excitations of the valence electrons amounts to
only a small fraction of the forbidden energy gap. If, however, energy
equal to or greater than E is provided by the passage of an energetic
ionizing particle through the diamond, it could momentarily become
a conductor due to excitation of one or more valence electrons into
the conduction band. This excitation could also occur optically for
wavelengths shorter than about 2200 Å, this wavelength corresponding to
the quantum energy of the forbidden gap. No optical absorption should 
occur in diamond for longer wavelengths ; also, because of the electrical 
symmetry, no infra-red absorption should occur if the lattice vibrations 
are entirely simple harmonic. 
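The density calculation mentioned earlier in this paragraph can be sketched in a few lines. The lattice parameter, atomic mass and mass unit used below are standard reference values assumed for illustration; they are not quoted in the text. The count of eight atoms per cubic cell follows from the face-centred cube with two atoms per lattice point.

```python
# Assumed inputs (standard reference values, not given in the text).
LATTICE_PARAMETER_CM = 3.567e-8    # cube edge of the diamond unit cell, about 3.567 angstrom
CARBON_ATOMIC_MASS_U = 12.011      # atomic mass of carbon in atomic mass units
ATOMIC_MASS_UNIT_G = 1.66054e-24   # grams per atomic mass unit

LATTICE_POINTS_PER_CELL = 4        # face-centred cube
ATOMS_PER_LATTICE_POINT = 2        # two-atom basis associated with each lattice point


def xray_density_g_per_cm3(lattice_parameter_cm):
    """Ideal density of a perfect diamond from the mass in one cubic cell."""
    atoms_per_cell = LATTICE_POINTS_PER_CELL * ATOMS_PER_LATTICE_POINT
    cell_mass_g = atoms_per_cell * CARBON_ATOMIC_MASS_U * ATOMIC_MASS_UNIT_G
    return cell_mass_g / lattice_parameter_cm ** 3


density = xray_density_g_per_cm3(LATTICE_PARAMETER_CM)
print(f"{density:.3f} g/cm^3")
```

With these assumed inputs the result is close to 3.52 g/cm³; a measured density below this figure is the kind of deviation that would point to vacant sites.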

For every valence electron excited to the conduction band a corresponding
excess positive charge or 'positive hole' remains in the valence
band. Quantum theory indicates that the electrons in the conduction 
band and the positive holes in the valence band may be accelerated by an 
external electric field relative to the crystal lattice ; the effective electron 
masses, designated by m*, may be greater or less than m, the free electron
mass. Theory indicates in general that m*/m < 1 for states near an energy
discontinuity, so that the effective masses of electrons and holes produced 
by just ionizing a diamond would be expected to be somewhat less than 
the free electronic mass. It has been shown theoretically (Seitz 1948) 
that the mobility of electrons excited into the conduction band in diamond 
should be about 156 cm² sec⁻¹ volt⁻¹, if m*/m is taken to be unity.
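To give the quoted mobility a physical scale, it may be converted into a mean time between collisions and a drift velocity. The sketch below assumes the simple relation mobility = e·τ/m*, which is an illustrative model and not necessarily the one used by Seitz; the field strength of 1000 volt/cm is a purely hypothetical value.

```python
ELECTRON_CHARGE_C = 1.602177e-19
ELECTRON_MASS_KG = 9.109384e-31

MOBILITY_CM2_PER_VS = 156.0  # theoretical figure quoted in the text for m*/m = 1


def relaxation_time_s(mobility_cm2_per_vs, mass_ratio=1.0):
    """Collision time implied by mobility = e * tau / m* (assumed simple model)."""
    mobility_si = mobility_cm2_per_vs * 1e-4  # convert cm^2 to m^2
    return mobility_si * mass_ratio * ELECTRON_MASS_KG / ELECTRON_CHARGE_C


def drift_velocity_cm_per_s(mobility_cm2_per_vs, field_v_per_cm):
    """Drift velocity for a given applied field."""
    return mobility_cm2_per_vs * field_v_per_cm


tau = relaxation_time_s(MOBILITY_CM2_PER_VS)
HYPOTHETICAL_FIELD_V_PER_CM = 1000.0  # illustrative value only
velocity = drift_velocity_cm_per_s(MOBILITY_CM2_PER_VS, HYPOTHETICAL_FIELD_V_PER_CM)
print(f"tau = {tau:.2e} s, drift velocity = {velocity:.3g} cm/s")
```

Under these assumptions the collision time is of the order of 10⁻¹³ sec.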

Summarizing some of the theoretical predictions, pure and perfect 
diamond crystals would therefore be expected to be good electrical insula- 
tors and to exhibit no optical absorption, no photoconductivity and no 
luminescence for exciting radiation of wavelength above 2000 Å.
They should of course be free from birefringence. 
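The wavelengths and energies quoted in this section are connected by the relation E = hc/λ. A short sketch of the conversion is given below; the value of hc in ev Å and the room temperature of 300 K are assumptions introduced for illustration.

```python
HC_EV_ANGSTROM = 12398.42          # Planck constant times speed of light, in ev * angstrom
BOLTZMANN_EV_PER_K = 8.617333e-5   # Boltzmann constant in ev per kelvin


def photon_energy_ev(wavelength_angstrom):
    """Quantum energy of radiation of the given wavelength."""
    return HC_EV_ANGSTROM / wavelength_angstrom


def threshold_wavelength_angstrom(energy_ev):
    """Longest wavelength able to supply the given energy in one quantum."""
    return HC_EV_ANGSTROM / energy_ev


for wavelength in (2000.0, 2200.0, 3000.0):
    print(f"{wavelength:.0f} angstrom -> {photon_energy_ev(wavelength):.2f} ev")

# Thermal energy at an assumed room temperature, as a fraction of a 6 ev gap.
ROOM_TEMPERATURE_K = 300.0
thermal_fraction = BOLTZMANN_EV_PER_K * ROOM_TEMPERATURE_K / 6.0
print(f"kT / gap = {thermal_fraction:.4f}")
```

The thermal energy comes out at well under one per cent of the gap, which is the basis of the prediction that the perfect crystal is an insulator.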


§ 3. PHYSICAL BEHAVIOUR OF REAL DIAMONDS


The physical behaviour of real diamonds is in striking contrast to the 
expected behaviour of a perfect crystal, no diamond having been yet 
reported with all the above characteristics and all diamonds showing
a wide range of deviations from some or most of the predicted properties. 
The occurrence of coloured diamonds as well as those of water-white 
appearance early showed the existence of optical absorption at wavelengths
above 2200 Å, but even water-white specimens also showed
deviations in many of the other physical properties. The first empirical 
classification (Robertson et al. 1933) divided observed water-white
diamonds into types 1 and 2, according to whether the ultra-violet
absorption became pronounced at wavelengths near 3000 Å or 2200 Å
respectively ; subsequent research, however (Champion 1952), has shown
the existence of intermediate types. All types show some infra-red
absorption, but type 1 shows extremely high absorption in the range of
8-10 μ. Similarly, luminescence is more marked in type 1 specimens.
Some photoconductivity occurs in all specimens for wavelengths much 
longer than the observed ultra-violet absorption limit but it does not become
pronounced unless λ<3000 Å for specimens of type 1 and unless
λ<2200 Å for type 2. On the other hand, although neither diamonds
of type 1 nor type 2 are perfect insulators, they often both have resistivities
above 10¹⁴ ohm cm. Both act as counters or solid ionization chambers,
giving charge pulses if an electric field is established across the specimen 
and an ionizing particle traverses it. The magnitude of the pulse is,
however, much greater for specimens of type 2, some of type 1 requiring
such a high electric field to demonstrate the effect that dielectric break- 
down may occur instead. Finally, diamonds of type 2 occur which are 
coloured and which are good conductors of electricity (Custers 1952). 
They are sometimes designated as type 2b, type 2a referring to the insulating
variety of type 2.

The above properties are those of natural diamonds, the corresponding 
properties for artificial diamonds (Bundy et al. 1955) not having yet
been reported. Diamonds which have been subject to intense high energy 
radiation show more marked deviations from the theoretical expectations 
for the perfect crystal. Thus, under neutron bombardment a water- 
white specimen becomes green (Dugdale 1953) and its counting properties, 
its ultra-violet transmission (Pringsheim 1953) and its electrical resistivity, 
are reduced (Benny and Champion 1954). A specimen of type 2, however,
does not acquire the properties of one of type 1 on irradiation. The 
radiation damage shows no annealing at temperatures less than about 
600°C and the annealing is then only partial. 

The density of diamonds is always less than that expected from the 
lattice parameter and radiation damage reduces the density still further 
(Primak et al. 1953). 

The balance of evidence shows that diamonds of type 2 exhibit more 
perfect cleavage than type 1 (Wilks 1955). Birefringent specimens are 
common in both type 2 and type 1 varieties. Some attempt has been 
made to correlate these variations with secondary features of the X-ray 
diffraction pattern (Grenville-Wells 1952) but with little success.
Taken as a whole it is clear that specimens of type 2 most nearly
approach the properties predicted theoretically for perfect diamond. 
This is supported by their greater thermal conductivity (Berman 1953) 
which is several times that of copper, but which is still much less than that 
to be expected theoretically. 


§ 4. APPLICATION OF DEFECT THEORY


The constancy of the x-ray diffraction pattern for natural diamonds 
shows that the main spatial variation in electron density is the same for 
both type 1 and type 2 specimens. To account for the optical and elec- 
trical differences between the two types, the possibility has been suggested 
that the structure remains perfect, but that the electron bonding shows 
variations leading to alternative sets of energy levels and a different value 
of the energy gap. This seems, however, most unlikely ; two electronic 
structures giving the same lattice parameter would be an extraordinary 
coincidence, and in any case the crystal should take up whichever
structure has the lower free energy. No transformation between one 
phase and the other has ever been observed, as would be expected if they 
were different phases. 

On the other hand, any deviation from the perfect diamond structure 
caused by lattice defects will lead to profound changes in many of the 
physical properties from those of ideal, pure diamond. At least seven 
types of defect may exist and any real diamond may contain some or all 
of them, which will frequently be distributed anisotropically throughout 
a given specimen. In general the defects will be relatively few in number 
and their direct influence on the lattice structure will be extremely 
localized. The constancy of such physical characteristics as the main 
x-ray diffraction pattern and the specific heat indicate that the defects 
must be of such a nature as to degrade the symmetry of the lattice in such 
a way as to leave certain physical properties unchanged while producing 
great variations in others. 

The seven possible different defects are : 1, Dislocations and Mosaicity. 
2, Single vacant sites. 3, Aggregates of vacant sites. 4, Single inter- 
stitial carbon atoms. 5, Aggregates of interstitial carbon atoms. 
6, Substitutional foreign atoms. 7, Interstitial foreign atoms. These 
seven types of defect are capable of some further sub-division, for the
aggregate interstitial carbon atoms of defect group 5 may be linked to form
a variety of structures such as graphite embedded in the main diamond 
matrix, while the interstitial foreign atoms of group 7 may be of several 
different elements. Defining perfect, pure diamond as in § 2, however, it 
is clear that all types of defect imply the existence of some breakdown of 
the perfect diamond structure in a pure diamond matrix and some of the 
observed physical properties of any selected specimen could be attributed 
to any of the several types of defects listed above. To determine the 
exact nature of the defects in any given diamond requires either the 
determination of a large number of its physical properties or an accurate 
quantitative comparison between theory and experiment for one or more
of its selected physical properties. As experimental techniques are not 
yet sufficiently developed for any but a few accurate quantitative measure- 
ments, and since solid state theory involves many assumptions and approxi- 
mations which make accurate quantitative prediction uncertain, the best 
available method at present for the determination of the nature of the 
defects consists in the measurement of several diverse physical properties 
of the specimen and comparison with a correlation table such as that 
shown on page 393. 

All types of defect will destroy the symmetry of the lattice and hence 
will allow the characteristics of the vibrational system to appear in the 
infra-red absorption spectrum. Similarly, the disturbance of the per- 
fectly periodic distribution of electrical potential will give rise to electric 
field gradients which, proceeding from the defect as centre, polarize the 
normal lattice in the vicinity of the defect and produce a wide variety of 
optical and electrical phenomena which would be absent in the perfect 
lattice. 


To illustrate the considerations which form the basis of the correlation 
table, the effects to be expected from the presence of vacant sites will first 
be considered (Champion 1956). It should be stressed, however, that 
attention is focused here almost entirely on the bonding between the carbon 
atoms and not, as in many considerations of crystal structure, on any
particular crystal plane. The plane diagrams used to develop these 
ideas do not therefore refer to any particular crystal plane and do not 
represent the projection of the atoms in a diamond lattice upon any 
particular plane. 

4.1. Single Vacant Sites


If a single carbon atom is missing from the interior of a diamond, the 
situation changes from that exhibited by the perfect diamond lattice 
shown by full lines in fig. 1 to that shown by dotted lines, where X indicates 
the vacant site and the dotted lines extending from X represent the 
covalent bonds which have now been disrupted by the removal of X. 
Consideration of the probable rearrangements of the bonds at the defect
to form a new equilibrium configuration is much facilitated by using a
schematic two-dimensional diagram of the bonding as shown in fig. 2 (a).


Fig. 2 (a)


An atom at A, fully supported on all sides by ordinary covalent bonding, 
is relatively normal in its behaviour except for certain secondary effects 
to be discussed later. The atoms B, C, D, and E, however, are unable
to complete one of their normal covalent bonds owing to the presence of 
the vacant site. 

It is suggested that the fourth electron of atom E can be shared equally 
between atoms B and D which are equidistant from the atom E in the 
two-dimensional diagram. Similarly, the unpaired electron of atom B is
shared with atoms C and E, and so on round the vacant site. The net
effect is that the atoms around the vacant site are linked by electron
bonds, for which the term defect bonds is suggested, operating on three
atoms instead of two as in the normal covalent bond. Such a picture will 
apply to any single vacant site in diamond, irrespective of the crystal 
plane from which the atoms have been removed, because of the complete 
symmetry of the tetrahedral linkage. The planar array in fig. 2 (a) does 
not, of course, correspond to any actual crystal plane but although it is 
not an exact equivalent of the three-dimensional situation shown in fig. 1, 
there is a complete correspondence in the number of degrees of freedom 
depicted by the two representations. This is because while in the plane 
diagram the atoms B, C, D, and E are depicted as coplanar, they are 
not depicted as all equidistant, whereas in the actual three-dimensional 
case, the atoms B, C, D, and E are, in fact, equidistant but they are not 
coplanar. The bonding does, however, indicate apparent conformity
with the principle of maximum overlapping of orbitals. 

The replacement of one normal bond by a defect bond for each atom 
around the vacant site and the asymmetric distribution of the two types 
of bond about the atomic centre suggests that differences will arise in 
certain physical properties of the defective lattice compared with the 
normal lattice. Massey (1950) has pointed out that an ordinary free 
neutral carbon atom possesses a screened field of such a nature as to allow 
the formation of a stable negative ion in an energy level appropriate to 
this field. The severance of one of the normal bonds and its replacement 
by a weaker defect bond is equivalent to imparting a partial freedom to the 
atoms surrounding the vacant site and consequently they are able to 
exhibit some of the features of free carbon atoms. The ability to form 
negative ions in the solid would be equivalent to the trapping of electrons 
and hence vacant sites would be expected to act as shallow electron traps. 
Immediately a vacant site has acquired a charge it will lead to other 
profound changes in the physical properties, for the surrounding atoms
will become polarized thus allowing the infra-red vibrational spectrum to
appear. If the effects produced are regarded as perturbations imposed on
the properties of the normal lattice, this polarization must lead to a
reduction of the normal energy gap between the valence band and the
conduction band in the vicinity of the charged site, for the polarization 
will produce a slightly increased freedom of the extranuclear electrons 
from their nuclear binding. In fact, the effect of the polarization would 
be expected to be more far-reaching than that of a small perturbation on 
the normal energy gap in that intermediate energy levels would be created 
and electronic transitions allowed which are completely absent in the 
perfect crystal. Optical absorption would then be exhibited at a con- 
siderable number of wavelengths ; for all these effects to appear, however, 
a mechanism would have to be present to maintain the vacant sites in an 
ionized state. The more numerous the vacant sites, the larger the overall
effect will be in any diamond specimen. At a sufficiently high defect 
density the affected domains may overlap in certain directions, leading to 
large variations in the electrical and optical properties of the specimen, 
while leaving the bulk of the lattice structure almost unaffected. 

However, not only will a single vacant site act as an electron trap but 
alternatively it will act as a trap for positive holes as will now be shown by 
making use of further schematic diagrams. The arrival of a positive 
hole at B in fig. 3 (a), consists in the removal of one of the covalent 
electrons. This broken bond may be reformed by the acquisition of an 
electron from A, whereupon the positive hole will be said to have diffused 
to A in a normal manner. Alternatively, however, the defect bond at B,
which corresponds to an electron less tightly bound than at a normal 
covalent bond, will reconstitute the normal bond at B by a transition from 
its higher energy level. The new situation is depicted in fig. 3 (b). The 
charge is now trapped at the vacant site because to re-enter the lattice 
would require the fracture of a normal covalent bond, and energy which
is not available is required for this. However, not only is the hole trapped
at the atom B but it is shared with the atoms C, D, and E, since the
defect bonds will switch from one pair of atoms to another around the 
vacant site. The positive hole is therefore no longer associated with any 
one atom but with the whole vacant site ; that is to say, it has become part 
of the defect system.

The electronic energy levels of the defect system will now change from 
those of a neutral vacant site to those of an ionized vacant site. While,
therefore, the ground level of the trapping system of a neutral site will
be very shallow, only a small way below the conduction band, that of the
ionized vacant site will be considerably deeper. In fact, except at a very 
low temperature indeed, the trapping of electrons at a neutral vacant site 
will be extremely short-lived since the traps will be emptied almost 
immediately by thermal energy. On the contrary, the ionized single 
vacant site will have important detectable effects, whether the ionization 
has arisen from the acquisition of a positive hole which has travelled 
through the valence band from some position where it has been produced 
by some ionizing agent, or from the direct action of some ionizing agent
exciting one of the defect bond electrons into the conduction band. 


If one of the electrons in the neutral defect is not excited into the 
conduction band but into one of the possible excited energy levels of the 
positive hole left by the defect electron, the combination gives rise to what 
we may call an exciton bound at the vacant site defect. Such excitation 
could be caused by the absorption of radiation of the appropriate wave- 
length or alternatively by the trapping of a free electron from the conduc- 
tion band at a vacant site which itself has already trapped a positive hole. 
It therefore follows that the quantitative measure of the trap depth below 
the conduction band should (when subtracted from the energy gap between
the top of the valence band and the bottom of the conduction band) 
coincide with a characteristic defect level for absorption. 

As the Schrödinger equations appropriate to defects have not yet been
solved, exact comparison of theory with experimental data is not possible. 
However, if the energy levels of the excited electron in the field of the
trapped positive hole are at all similar to those of a hydrogen atom, the
electron trap depth or ionization potential appropriate to this situation
will be given by E/K² where E is the ionization potential of the hydrogen
atom and K is the dielectric constant of diamond. Since E=13.6 ev
and K=5.8 for diamond, the ground state should be at 0.4 ev below the
conduction band. Taking the experimental value of the energy gap to be
that corresponding to the onset of continuous absorption at 2200 Å,
which corresponds to 5.6 ev, the ground state of the exciton regarded as
a hydrogen-like atom would be at 5.2 ev or about 2360 Å. Recently
Champion and Humphreys (1957) have made a broad survey of the 
ultra-violet absorption properties of many diamonds. It has been shown 
that one class of diamonds is characterized by very good transmissions 
to 2200 Å except for the existence of a distinct band near 2360 Å. This is
shown experimentally as a marked dip in the current of the photomultiplier
recorder at this wavelength. The simple assumptions made in the
above theory prevent the obvious identification being made with any cer- 
tainty but the qualitative optical and electrical properties to be expected 
theoretically for diamonds containing mainly single site defects are 
certainly those shown by this type. In the accompanying table 1 on 
p. 393 these diamonds are designated as class 3 if they are supposed to 
contain an abundance of single vacant sites and class 2 if they 
contain only a very small density of single vacant site defects, for such 
specimens are regarded as showing the nearest approach to the class 1 
diamond which is free from defects. Class 2 diamonds have such a low 
density of single vacant sites that they show no appreciable absorption 
band at 2360 Å, the trace being free from any evidence of a band, or
just showing a small inflection (Champion and Humphreys). They 
have almost all the properties of the perfect diamond but the electrical 
techniques show evidence from the shape of the curve showing the height 
of counting-pulse as a function of field, for the existence of the few single 
vacant site traps. Saturation would set in, and at a lower electrical 
field than observed, and electrical polarization would be absent, if the 
density of single vacant sites was zero. They are, of course, very good 
counters. As the classification of the specimen rises from class 2 to class 3, 
so the counting properties deteriorate and the absorption band at 2360 Å
grows stronger. One specimen is recorded (Champion 1952) as of such a
high single vacant site defect density as to be a non-counter at ordinary
field strengths and to show a prominent isolated band near 2400 Å, by simple
photographic recording. While most of the diamonds considered by
Champion et al. were small regular octahedra, the class 3 specimen
referred to above occurred in the form of a very thin flat plate.
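The hydrogen-like estimate made earlier in this paragraph can be reproduced numerically. The sketch below uses only the rounded figures given in the text (13.6 ev, a dielectric constant of 5.8 and a gap of 5.6 ev); the value of hc in ev Å is an assumed constant. It is intended as a check on the order of magnitude and not as a refinement of the theory.

```python
HC_EV_ANGSTROM = 12398.42        # assumed constant: hc in ev * angstrom

HYDROGEN_IONIZATION_EV = 13.6    # ionization potential of the hydrogen atom (from the text)
DIELECTRIC_CONSTANT = 5.8        # dielectric constant of diamond (from the text)
ENERGY_GAP_EV = 5.6              # gap taken from the 2200 angstrom absorption edge (from the text)


def hydrogenic_trap_depth_ev(ionization_ev, dielectric_constant):
    """Trap depth below the conduction band, E / K**2."""
    return ionization_ev / dielectric_constant ** 2


trap_depth = hydrogenic_trap_depth_ev(HYDROGEN_IONIZATION_EV, DIELECTRIC_CONSTANT)

# Ground state of the bound exciton, measured from the top of the valence band.
exciton_level = ENERGY_GAP_EV - trap_depth
exciton_wavelength = HC_EV_ANGSTROM / exciton_level

# Wavelength able to release the electron from the trap into the conduction band.
release_wavelength_micron = HC_EV_ANGSTROM / trap_depth / 1e4

print(f"trap depth          = {trap_depth:.2f} ev")
print(f"exciton level       = {exciton_level:.2f} ev")
print(f"exciton wavelength  = {exciton_wavelength:.0f} angstrom")
print(f"release wavelength  = {release_wavelength_micron:.1f} micron")
```

With these rounded inputs the exciton wavelength falls in the neighbourhood of 2400 Å, the same region as the observed band; the exact figure depends on the value adopted for the energy gap.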

The infra-red absorption behaviour is more difficult to predict and 
interpret and is discussed more fully on p. 402. For example it is clear 
that the exciton formed by an electron in the field of a positive hole which 
is itself trapped at a vacant site, can be released into the conduction band 
by radiation of energy 0.4 ev or wavelength about 3 μ. Bands occur
experimentally in this region but they may arise from anharmonic 
forces present in the lattice in the neighbourhood of the asymmetry caused 
by the single vacant site defects. They should certainly be more intense 
for class 3 diamonds than for class 2 but infra-red absorption experiments 
have not yet been made on the specimens. 

The theory outlined above indicates that no luminescence is to be 
expected by diamonds in class 2 or class 3 since if the vacant sites are 
uncharged, no energy levels occur which would allow transitions in the 
visible spectrum. Experimentally this is confirmed, such diamonds 
being markedly free from luminescence under longer wavelength ultra-violet
excitation. They exhibit, however, a blue fluorescence under α-particle
irradiation, but secondary effects produced by bond breaking and
remaking, excitation by the intense electric field and the lattice damage 
produced by the particle may account for this. 


4.2. Aggregate Vacant Sites or Cavities 


The situation arising when several vacant sites coalesce may be 
schematically depicted as in fig. 2 (b) for a special symmetrical case. An 
atom situated at F is equivalent to an atom situated in a true surface of 
the crystal, although it is an interior surface in a cavity within the crystal. 
The incomplete normal covalent bond will now give rise, by linkage to its 
neighbouring surface atoms to a second type of defect bond. This 
situation will apply equally to all the atoms on the surface of the cavity, 
except for atoms such as B and E situated on both sides of angular edges 
of the cavity in the actual three-dimensional situation. If the crystal is 
of octahedral habit, all the edges at which the surface planes of the 
cavities meet will ideally be formed by planes at an angle of 109° 28'.
Hence at every cavity edge for octahedral specimens, atoms such as B
and E will be linked with each other by a third type of defect bond,
characteristic of the crystal habit. 

The effects of these cavities on the physical properties of diamonds will 
be qualitatively similar although quantitatively different from those 
produced by single vacant sites. Again the symmetry of the lattice will 
be disturbed and optical absorption and charge trapping will be in evidence. 
Because of the greater distortion of the defect bonds for cavities, the energy 
levels of electron traps will lie somewhat deeper but they will still act 
primarily as positive hole traps. A new phenomenon, however, now 
appears in that while it seems improbable that a single vacant site could 
accommodate more than one positive hole because of their mutual repulsion,
such multiple trapping could occur at a cavity site. If a hydrogen
model is still applicable, for Z=2, the defect ionization potential for a
doubly charged cavity would be four times that for a single vacant site,
that is at 1.6 ev. With the energy gap of 5.6 ev, this implies that the
ground state would lie at 4.0 ev, or about 3000 Å, above the valence
level. It may therefore be plausibly suggested that the second broad
classification termed type 1 diamonds, which show such a marked ultra-violet
absorption for all wavelengths shorter than 3000 Å, correspond to
diamonds having an abundance of cavity sites. A wide variety of possible 
energy levels clearly arises according to the size of the cavity. Sometimes 
the energy levels will be those analogous to the hydrogen molecule, and
at other times even to the helium molecule. The latter have energy
states deeper than those of the hydrogen atoms and would therefore give
rise to levels lower than 4.0 ev above the valence band. Excitation to
states of much lower energy now becomes possible for cavity sites than can
occur for single vacant sites only and transitions between these levels 
would be expected to contribute to luminescence which could be excited 
by near ultra-violet radiation. Delayed de-excitation would be exhibited 
as phosphorescence and hence diamonds in classes higher than class 3 in 
table 1 would be expected to show luminescence and phosphorescence. 
Such luminescence is, indeed, characteristic of type 1 diamonds (Bull and
Garlick 1950). 
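The factor of four for the doubly charged cavity follows from the Z² dependence of hydrogen-like levels. A brief sketch of this scaling, again using the rounded figures of the text and an assumed value of hc in ev Å:

```python
HC_EV_ANGSTROM = 12398.42        # assumed constant: hc in ev * angstrom

HYDROGEN_IONIZATION_EV = 13.6    # from the text
DIELECTRIC_CONSTANT = 5.8        # from the text
ENERGY_GAP_EV = 5.6              # from the text


def cavity_trap_depth_ev(charge_number):
    """Hydrogen-like ionization potential for a defect carrying charge Z."""
    return charge_number ** 2 * HYDROGEN_IONIZATION_EV / DIELECTRIC_CONSTANT ** 2


def level_above_valence_band_ev(charge_number):
    """Ground state of the defect measured from the top of the valence band."""
    return ENERGY_GAP_EV - cavity_trap_depth_ev(charge_number)


for z in (1, 2):
    level = level_above_valence_band_ev(z)
    wavelength = HC_EV_ANGSTROM / level
    print(f"Z={z}: depth {cavity_trap_depth_ev(z):.2f} ev, "
          f"level {level:.2f} ev, about {wavelength:.0f} angstrom")
```

For Z=2 the level lies close to 4.0 ev, and the corresponding wavelength is a little above 3000 Å, in reasonable accord with the absorption limit of type 1 diamonds.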

In table 1, class 4 diamonds are defined as those supposed to contain 
single vacant sites and cavity sites, both at low density compared with the 
proportion of perfect lattice in the specimen. Such diamonds would be 
expected to show a faint absorption at wavelengths shorter than 3000 Å,
the weak density of defects preventing the exhibition of intense absorption 
peaks. One of their main distinguishing features is that while the overall 
very small density of defects allows them to be good counters, the cavities 
will act as barriers to the passage of electric charge. Hence a mono- 
energetic beam of ionizing particles will not give rise to a pulse spectrum 
of one uniform height nor will it show quantitatively the collection of the 
full charge released by the ionizing agent, even at saturation field. This 
is because the cavities block the passage of certain portions of the charge 
on its way to the electrodes and this leads to a wide distribution of pulse 
heights although the incident radiation is monoenergetic. Almost all
diamond counters show such a pulse height distribution (Champion 1956), 
but a complete interpretation requires some allowance also for the effect 
of field inhomogeneity across the specimen. 

It is common in the literature to divide the energy of the ionizing 
particle by the maximum charge collected at the electrodes, the result 
appearing as electron volts per ion pair. This value is small for the best
counters and large for poorer counters. Since the number of defects,
however, even for the worst counters or non-counters is not greater than 
1 in 10° atoms of the normal lattice, it is inconceivable that the energy 
required to produce an ion pair in any specimen should differ appreciably 
from that for a perfect diamond. The correct inference is that the charge 
collected at the electrodes is reduced, compared with the charge actually 
produced, in direct proportion to the trapping density, that is the defect 
density of the specimen, and not that the differences arise from genuine 
differences in the energy per ion pair. 


As the density of single vacant site defects and cavities increases, so the 
ultra-violet absorption tail below 3000 Å becomes more marked and the 
counting properties become progressively poorer. Champion and 
Humphreys give spectroscopic records of these specimens and when the 
density of defects has risen to such a magnitude that almost complete 
absorption sets in for wavelengths shorter than 3000 Å, accompanied by 
such intense charge trapping that counting pulses are no longer visible, 
the specimens have precisely those characteristics originally associated 
with type 1 diamonds. They are represented by class 5 in table 1. The 
ultra-violet absorption below 3000 Å is accompanied by the onset of 
photoconductivity ; no discrete energy levels are detectable in the con- 
tinuum. Experimentally therefore for type 1 diamonds the energy gap 
is only about 4 ev as compared with 5.6 ev for the ideal diamond lattice. 
The interpretation of this narrowing of the energy gap is that it arises 
from the complete overlap throughout the specimen of the depression of 
the normal energy level. This situation implies the high defect density 
already postulated for type 1 specimens and allows an approximate quanti- 
tative estimate to be made. Assuming the hydrogenic model, a multiply 
charged cavity will exert an appreciable field up to about 25 Å from its 
centre. Hence these fields will overlap if the cavities are about 50 Å 
apart. With atomic separations of about 1 Å, the three-dimensional 
defect density is thus about one in 10⁵. 
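
The arithmetic behind this estimate can be sketched in a few lines. The function name and the simplification of one cavity per cube of side equal to the cavity spacing are illustrative assumptions, not part of the original argument.

```python
def sites_per_cavity(field_range_angstrom: float, atomic_spacing_angstrom: float) -> float:
    """Lattice sites per cavity when the fields of neighbouring cavities just overlap.

    Assumes one cavity per cube whose side is twice the field range
    (an illustrative simplification of the estimate in the text).
    """
    cavity_spacing = 2.0 * field_range_angstrom  # fields of adjacent cavities meet
    sites_per_edge = cavity_spacing / atomic_spacing_angstrom
    return sites_per_edge ** 3


# Field range of about 25 Å and atomic separation of about 1 Å, as in the text.
n_sites = sites_per_cavity(field_range_angstrom=25.0, atomic_spacing_angstrom=1.0)
print(f"about one defect in {n_sites:.3g} lattice sites")
```

With a cavity spacing of 50 Å the cube contains 50³ sites, which is of order 10⁵.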


4.3. Dislocations and Mosaicity 


Grenville-Wells (1952), following on the work of Lonsdale, showed that 
type 2 diamonds exhibited mosaicity under test by the divergent x-ray 
beam method far more frequently than type 1 specimens. However, the 
best counters were those type 2 diamonds which were also free from 
mosaicity. It is inferred that mosaicity and dislocations act as additional 
trapping barriers but that their optical and electrical effects do not differ 
essentially from those produced by single and aggregate vacant sites. 
The discontinuities at the crystallite boundaries which form the disloca- 
tions and mosaic structure may therefore be treated as a special case of 
vacant site defects. However, the absence of mosaicity in type 1 
diamonds naturally led initially to the suggestion that from x-ray data 
alone, type 1 diamonds were the most ‘perfect’. This is in such striking 
contrast to all the other properties of type 1 diamonds that it is now 
generally agreed that the ‘perfection’ indicated only freedom from 
mosaicity. Indeed, a closer study of the x-ray pattern of type 1 diamonds 
showed the existence of additional spots or streaks which were absent 
from the type 2 diffraction patterns. This may well indicate that the 
vacant site defects or other possible imperfections in type 1 specimens 
are distributed throughout the lattice in a sufficiently well-organized 
fashion to give freedom from mosaicity, although the overall defect 
density is one or two orders of magnitude larger than that in type 2 
specimens. This would account for the comparative independence of 
the two x-ray effects based on the divergent beam pattern and the 
presence of anomalous streaks, for the former depends on mosaicity while 
the latter depends on ordered defect density. The question as to why the 
defect density and distribution vary so widely as to constitute the key to 
the difference between type 2 and type 1 diamonds is bound up with the 
question of the conditions of growth of the two types. Speculations on 
diamond growth are deferred until § 5. It may, however, be noted that 
those diamonds free from mosaicity are also those most free from bire- 
fringence. No detailed study has yet been made on birefringence but 
from the little work which has so far been done (for example, Freeman and 
van der Velden 1950), absence of birefringence correlates strongly with 
uniformity of overall texture whether the specimen is type 2 or type 1. 
The relative unimportance of mosaic structure in affecting the counting 
properties is paralleled in thermal conductivity (Berman 1953). 


4.4. Interstitial Carbon Atoms 


The question of the effect of interstitial carbon atoms in the diamond 
lattice is of exceptional interest since the open nature of the lattice permits 
the insertion of isolated carbon atoms without great disturbance of the 
lattice. Thus, if the carbon atoms are regarded as solid spheres having a 
diameter such that two nearest-neighbour atoms just come into contact 
along the tetrahedral bond directions, such a ‘free’ carbon atom may be 
placed at the mid-point between two normal lattice atoms in the cube 
directions. Interaction with the lattice would then be confined to such 
as would support the interstitial carbon atom against gravity. There are 
several reasons why such a theoretically possible arrangement does not 
occur in practice. First it is difficult to imagine any conditions of diamond 
growth which could lead to such a situation. Second, a lattice which 
contained a high percentage of such interstitial atoms would have a 
density up to more than 50% larger than that to be expected from the 
mass of the carbon atoms and the normal lattice spacing for diamond. 
Indeed, the fact that the experimental density is always less than that for 
the perfect lattice indicates that the defects must be such that the vacant 
sites occur far more frequently than interstitial carbon atoms. On the 
other hand, the insertion of two or more carbon atoms in juxtaposition 
would lead to strong lattice distortion which could hardly fail at appre- 
ciable concentration to be reflected in deviations from the main x-ray 
pattern which have not been observed in natural diamonds (Straumanis 
and Aka 1951). Even at small concentration the localized distortion of 
the lattice would be expected to lead to the creation of levels in the 
normally forbidden gap and hence to optical absorption and coloration of 
the diamond as discussed in the next paragraph. There therefore seems 
no possibility of explaining the variations in natural water-white diamonds 
in terms of ‘free’ interstitial carbon atoms. 

Irradiation of diamonds by neutrons, however, would be expected to 
give rise to displaced carbon atoms which are forced into interstitial 
positions, leaving a vacant site where the carbon atom has been removed. 
Such experiments have been made by Benny and Champion (1954) and 
others (Kinchin and Pease 1955) and the results have already been 
mentioned on p. 386. Such artificial defects augment some of the effects 
produced by natural defects already present, and lead to certain effects 
not shown by the natural defects, such as partial annealing at high tem- 
peratures and green coloration of the previously water-white specimen. 
This leads to important conclusions. First it shows by deliberate creation 
that pure lattice defects do lead to optical absorption and to electron 
trapping, that is the creation of energy levels in the forbidden zone. 
Second, it shows that the artificial defects are not identical with the natural 
defects. It follows that the majority of the defects in natural diamonds 
are produced during the growth process and cannot be attributed to the 
subsequent action of penetrating radiations on diamonds which were 
initially of perfect growth. Moreover, the failure of the natural defects 
to anneal shows that these do not consist of close vacancy-interstitial 
pairs and this leaves the evidence overwhelmingly in favour of vacancy 
defects. 


4.5. Foreign Atoms 


From the density deviations from that expected for an ideal lattice, 
from calculations on trap density when used as counters or solid ionization 
chambers (Champion 1953) and from thermal conductivity and other 
measurements, it is clear that the defect density in the water-white type 
of diamonds considered so far rarely exceeds one in 10° atoms of the per- 
fect lattice and is often much less. Since so many of the physical proper- 
ties may be explained in terms of vacant sites alone, the proportion of 
foreign atoms and consequently their direct effect on the behaviour of the 
diamond must be surprisingly small. Nevertheless, chemical tests 
(Chesley 1942) have shown that even the purest diamonds contain a minute 
percentage of foreign atoms such as silicon and magnesium. Three 
ways in which these foreign atoms may be present are (a) as inclusions of 
appreciable aggregates of the foreign atoms, (b) substitutionally in the 
lattice, (c) interstitially. There is plenty of microscopic evidence for 
the existence of inclusions but their effect, apart from the production 
of birefringence due to strain produced, may be largely regarded as 
separating one portion of pure diamond from another. On the other 
hand, if they occur as substitutional or interstitial atoms in the lattice 
their influence on the physical properties of the specimen would, according 
to the theory already developed, be expected to be profound as shown in 
the schematic diagram fig. 4. A major classification may be made 
according to whether the foreign atom can, like silicon, satisfy the carbon 
valency, or fail to satisfy it by occupying some other chemical group. 
However, even with substitutional silicon, the electronic charge density 
in the neighbourhood will differ from that of the perfect lattice and the 
general result will be optical absorption and electrical behaviour associated 
with the creation of energy levels within the normally forbidden energy 
gap. Such behaviour will be even more pronounced if the silicon atom is 
interstitial rather than substitutional and several adjacent silicon atoms 
could clearly give rise to silicon carbide grouping. The two questions 
which present themselves therefore are: (1) why are the effects of the 
small proportion of foreign atoms present in the purest diamonds not 
more in evidence than might be expected, and (2) are there any diamonds 
which exhibit properties which undoubtedly require the presence of 
foreign atoms for their explanation ? 

In answer to the second question it may be stated that quite apart 
from chemical evidence, there is strong evidence that foreign atoms are 
present in the purest diamonds of both types 1 and 2, when the existence 
of the dark current is considered. It has been already stated that many 
diamonds in both categories show a specific resistance of about 
10¹⁴ ohm cm which is considerably less than the perfect insulation to be 
expected from the energy gap of 5.6 ev. Illumination by ultra-violet 
light allows the levels due to vacant site defects to become operative and 
the scintillations produced by ionizing particles will have a similar effect. 


Fig. 4 


Nevertheless, a diamond, which has been carefully shielded from such 
modes of excitation and in which the effects of past excitations have been 
removed, yields a dark current which shows quite clearly the existence 
of acceptor or donor levels in the forbidden gap. It is concluded that these 
arise from foreign atoms which are therefore responsible for some of the 
optical absorption and some electrical phenomena. It is considered that 
the failure of such foreign atoms to produce more profound effects than 
they do must arise from the presence of these foreign atoms at some 
position in the diamond where they produce singularly little effect on the 
lattice. Such a position is clearly afforded by the surfaces of the cavity 
vacant sites as in fig. 2 (b) for an abundance of foreign atoms can be 
accommodated in such cavities without giving rise to major changes in all 
the physical characteristics of the diamonds. Moreover, their presence 
would result in values of the densities of different diamond specimens 
which, while they are extremely irregular, show overall densities which are 
never very far removed from that of the perfect lattice. Thus while at 
first sight the density of type 1 diamonds, with their greater density of 
cavity sites, might be expected to be less than that of type 2 specimens 
if such cavities were completely empty, the presence of a varying number 
of foreign atoms within the cavities would lead to densities that might 
be either greater or less than any selected type 2 specimen. Such minute 
variations in density are very difficult to measure (Smakula and Sils 
1955), but as far as accurate density measurements have been made this 
is the experimental observation, the correlation between density and 
diamond type being extremely poor. Further, since the number of 
charges supplied by the acceptor or donor levels would be at least approxi- 
mately proportional to the density of cavity sites containing foreign atoms, 
the charge trapping would also be proportional to the cavity density and 
hence the overall conductivity of the specimens would be expected to be 
independent of the cavity density. Hence type 1 and type 2 diamonds 
would be of about the same specific resistance, which is the experimental 
observation. 

Since the presence of donor or acceptor atoms seems essential for the 
explanation of the dark current it is of interest to enquire whether they 
alone could account for the optical behaviour which has hitherto in this 
article been ascribed to vacant sites. An acceptor level at about 0.4 ev 
above the valence band would automatically lead to many of the proper- 
ties of diamonds already listed. Thermal excitation would allow a certain 
proportion of the acceptor sites to be filled, thus releasing positive holes 
as a supply of dark current in the valence band. The infra-red absorption 
at about 0.4 ev is obviously accounted for, while the observed ultra-violet 
absorption at about 2360 Å, corresponding to 5.2 ev, is just that required 
to raise an electron from a filled acceptor to the normal conduction band. 
Since the filled acceptor is a negative ion, an electric field exists in its 
neighbourhood which will polarize the lattice and allow interband transi- 
tions. This explanation, however, fails because, among other reasons, 
of the absence of experimental correlation between the magnitude of the 
dark current and the intensity of the spectral absorption of the specimen. 
While, therefore, such foreign acceptor atoms no doubt contribute to the 
optical absorption in just the wavelength region required, some additional 
defect is required which will give the necessary optical absorption but not 
contribute to the dark current. This is, of course, one of the properties 
already ascribed to the vacant site. 
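
The correspondence between the wavelengths and energies quoted in this paragraph can be checked with the relation E = hc/λ. The sketch below is illustrative only: the helper names are assumptions, the value of hc is the standard rounded constant in ev Å, and the gap of 5.6 ev and acceptor depth of 0.4 ev are the figures used in the text.

```python
HC_EV_ANGSTROM = 12398.42  # h*c in eV·Å (standard constant, rounded)


def photon_energy_ev(wavelength_angstrom: float) -> float:
    """Photon energy in eV for a wavelength given in Å."""
    return HC_EV_ANGSTROM / wavelength_angstrom


def photon_wavelength_angstrom(energy_ev: float) -> float:
    """Photon wavelength in Å for an energy given in eV."""
    return HC_EV_ANGSTROM / energy_ev


GAP_EV = 5.6        # energy gap of the ideal lattice, as quoted in the text
ACCEPTOR_EV = 0.4   # acceptor level above the valence band, as quoted in the text

# Transition from a filled acceptor to the conduction band.
transition_ev = GAP_EV - ACCEPTOR_EV

print(f"2360 Å corresponds to {photon_energy_ev(2360.0):.2f} eV")
print(f"{transition_ev:.1f} eV corresponds to {photon_wavelength_angstrom(transition_ev):.0f} Å")
print(f"3000 Å corresponds to {photon_energy_ev(3000.0):.2f} eV")
```

The transition energy of 5.2 ev falls within about 25 Å of the observed 2360 Å absorption, and the 3000 Å cut-off corresponds to roughly 4 ev, in line with the narrowed gap quoted for type 1 diamonds.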

The green coloration of water-white diamonds under neutron bombardment 
strongly suggests that the distortion of the lattice by varying 
numbers of interstitial carbon atoms gives rise to energy levels in the 
forbidden gap such as to allow transitions in the visible region. Illumina- 
tion by white light then results in selective absorption by excitation from 
the valence level to levels which leave green as the complementary colour. 
Diamonds occur naturally with a wide variety of colours such as blue, 
green, purple, pink, yellow, orange and brown, and by analogy with the 
artificially coloured green diamonds, where interstitial carbon atoms are 
involved, it is suggested that these other colours arise from foreign atoms 
present substitutionally or interstitially in the lattice. There is clearly a 
very large number of theoretical possibilities and it would seem that syste- 
matic comparison work on synthetic diamonds incorporating a variety of 
foreign atoms under varied conditions of crystal growth is necessary to 
decide the constitution of some specimens with certainty. However, a 
certain narrowing of alternatives is still possible by considering some of 
the other physical properties of a specimen besides its colour. For 
example, a water-white diamond which has been coloured green by neutron 
bombardment, anneals to golden brown. It is inferred that the non- 
diamond disarray of the carbon atoms produced by neutron impact 
rearranges to graphite which is the stable crystalline form at normal 
pressure. This more ordered arrangement results (Benny and Champion 
1956) in an improvement in the optical transmission (Ditchburn et al. 
1954), in the magnitude of the counting pulses, and in the electrical 
resistance of the specimen, in a manner quite consistent with a partial 
restoration of the regularity of the potential variation throughout the 
crystal. The absence of any appreciable annealing below red-heat shows 
that even the unstable arrangement of carbon atoms within the diamond 
matrix is very stable compared with the radiation damage produced in 
germanium and other materials having the diamond structure. The 
activation energy is therefore extremely high for the interchange of 
interstitial positions, normal lattice positions, and vacant sites, and the 
absence of annealing of natural vacant site defects in water-white speci- 
mens is not surprising. Because of the variety of phenomena attendant 
upon different possible arrangements of different foreign atoms, any 
simple classification of diamonds possessing foreign atom impurities is 
scarcely practicable. However, in table 1 the general effect of relatively 
small and relatively large proportions of foreign atoms in the diamond 
lattice is represented by classes 6 and 7 respectively. 


§ 5. DISCUSSION OF SOME UNSOLVED PROBLEMS 


It will be evident that although a wide variety of experimental facts 
are known concerning the properties of many diamonds, and that solid 
state theory is capable of interpreting these in a general way, many of the 
problems cannot yet be said to have been solved completely and un- 
ambiguously. Three of these problems are now selected and discussed. 


5.1. Conducting Diamonds 


It has already been stated that several cases have been reported of 
diamonds which should be classed rather as conductors than insulators, 
since their specific resistance is orders of magnitude less than that of 
insulating diamonds. Some of these have been designated type 2 (b) 
since they show appreciable ultra-violet transmission yet differ in some 
way from ordinary type 2 (a) diamonds, which have resistances greater 
than 10¹⁴ ohm cm. The properties of coloured 2 (b) specimens may be 
explained generally by describing them as type 2 material containing 
non-diamond in the form of foreign atoms or graphitic carbon distributed 
through the specimen. Such diamonds exhibit many of the properties of 
semiconductors, their resistance dropping as the temperature rises ; 
moreover, since the conduction is p-type, it occurs in the valence band 
of the diamond matrix. However, Custers (1955) has also reported 
colourless type 2 (b) diamonds ; thus it does not necessarily follow that the 
defects which give rise to conductivity also give rise to coloration. 
From what has already been said on p. 400 et seq., the simplest explana- 
tion is that if the diamond is conducting and coloured, at least some of the 
foreign atoms enter substitutionally or interstitially in the diamond 
lattice, whereas if the diamond is colourless, the foreign atoms are in the 
form of aggregates in cavities throughout the specimen. Also, if the 
number of foreign atoms within the cavities is very small or concentrated 
in one part of the cavity, the diamond will be type 2 (a), whereas if it is 
large or extended throughout the cavity it will be type 2 (b). Consistent 
with this interpretation is the fact that type 2 (b) diamonds show a 
green-blue phosphorescence which is undetectable in type 2 (a) specimens. 
Certainly both type 2 varieties show irregular morphology and a markedly 
laminated structure which suggests that the lamination may contain a 
sufficiently extensive layer of non-diamond in the type 2 (b) variety to 
render it conducting by the indirect process of the release of positive 
holes in the valence band. It has long been known (Stratton and 
Champion 1952, Taylor 1956) that all diamonds are non-homogeneous in 
their trap distribution. It is possible that electrodes controlled by micro- 
manipulator techniques may throw light on the validity or otherwise of 
these suggested explanations. 


5.2. Infra-Red Absorption Spectrum 


A considerable number of investigations (Sutherland et al. 1954) has 
been made on the infra-red absorption spectrum of diamonds, but there 
is yet no fully satisfying interpretation of the observed results. The main 
experimental fact is that both types of diamond show absorption bands 
between 4-5 µ, whereas a strong absorption between 8-10 µ shown by 
type 1 diamonds is absent from the type 2 variety. However, the extinc- 
tion coefficient of the 8-10 µ band varies widely in different type 1 speci- 
mens and correlates well with the position of the ultra-violet cut-off. 
In this respect type 2 and type 1 diamonds form one continuous group, 
as already indicated in table 1 where they occupy classes 4 and 5, for they 
differ mainly in the density of vacant sites. While, however, as shown on 
p. 392 et seq., the vacant sites alone may give rise to energy levels which 
result in transitions in the infra-red, a simple variation with vacant site 
density, either in the form of single vacant sites or cavities will clearly not 
explain why the absorption band at 4-5 µ is of comparable intensity for 
all diamonds, whereas the band at 8-10 µ is confined to the type 1 variety. 
It is inferred that the band at 8-10 µ involves some additional feature, 
which in line with other hypotheses of this article, receives a possible 
explanation in terms of foreign atoms contained in cavity sites. However, 
it is clearly not permissible to explain simultaneously both the properties 
of type 1 diamonds and of type 2 (b) diamonds by the presence of the 
same foreign atoms arranged in the same way. The difficulty is avoided 
if type 2 (b) diamonds are supposed to consist mainly of type 2 material, 
the inclusions being not foreign atoms but carbon atoms arranged as in 
graphite, whereas type 1 diamonds contain isolated cavities containing 
varying amounts of foreign atoms. Such an explanation would require 
the density of type 2 (b) diamonds to be less than that of type 1 specimens 
and the little available evidence supports this. Additional supporting 
information for regarding some of the type 1 characteristics as closely 
bound up with foreign atoms contained in cavities is that the intensity of 
some of the 8-10 µ absorption bands is temperature independent. This 
contrasts with the 4-5 µ bands which show temperature variations and 
seem to apply to the diamond lattice only (Collins and Fan 1954). Again, 
neutron irradiation does not give rise to foreign atoms but only to a 
disordered arrangement of carbon atoms. Hence it should not and does 
not give rise to the presence of the 8-10 µ infra-red absorption band charac- 
teristic of type 1 diamonds, when type 2 diamonds are irradiated. 


5.3. Diamond Growth 


There seems little doubt that the artificial synthesis of diamonds has 
now been achieved (Bundy et al. 1955) and the way has thus at last been 
opened to a comparison of the properties of various natural specimens 
with those of synthetic specimens produced under controlled physical 
conditions and of given foreign atom concentration. Unfortunately such 
studies are at present confined to one laboratory. The conclusions of 
Neuhaus (1954) concerning the need for diamond seeding to avoid the 
metastable formation of graphite even in the stable formation range of 
diamond, are not borne out in practice, and some speculations prior to 
1955 on the conditions necessary to diamond growth have now been at 
least partially replaced by factual experimental data. The equilibrium 
phase diagram for carbon as now understood is shown in fig. 5. Only 
the boundary between graphite and its vapour at relatively low pressures 
is accurately established by experimental measurements. Rossini and 
Jessup (1938) calculated thermodynamically the position of the boundary 
between graphite and diamond at lower temperatures, the extension of the 
curve to higher temperatures being a plausible extrapolation. In experi- 
mental work Bridgman has used pressures up to 400 000 atm. at room 
temperature and has momentarily attained 3000 kg cm⁻² at 3000°K. 
The failure to grow diamonds at room temperature even at these pressures 
is attributed to the negligible transformation rate. By developing some 
new ways of distributing stress and giving support to critical parts, Bundy 
et al. have succeeded in constructing pressure vessels which operate at 
pressures up to at least 100 000 kg cm⁻² at temperatures greater than 
2300°K for hours of continuous operation. Under these conditions 
processes were discovered, the details of which have not been released, 
but which yielded diamonds of linear dimensions from 100 microns to 
1 mm. Nucleation and growth occur spontaneously without the necessity 
of seeding. 
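
The pressures in this paragraph are quoted in two different units. The sketch below converts them to a common one for comparison. It assumes that "kg cm⁻²" denotes kilograms-force per square centimetre and that "atm." is the standard atmosphere; the function names are illustrative.

```python
KGF_PER_CM2_IN_PA = 98066.5  # 1 kgf/cm² = 9.80665 N over 1e-4 m²
ATM_IN_PA = 101325.0         # standard atmosphere in pascals


def kgf_cm2_to_gpa(pressure_kgf_cm2: float) -> float:
    """Convert a pressure from kgf/cm² to gigapascals."""
    return pressure_kgf_cm2 * KGF_PER_CM2_IN_PA / 1e9


def atm_to_gpa(pressure_atm: float) -> float:
    """Convert a pressure from standard atmospheres to gigapascals."""
    return pressure_atm * ATM_IN_PA / 1e9


synthesis_gpa = kgf_cm2_to_gpa(100_000)  # pressure vessels of Bundy et al.
bridgman_gpa = atm_to_gpa(400_000)       # Bridgman, room temperature

print(f"100 000 kg/cm² is about {synthesis_gpa:.1f} GPa")
print(f"400 000 atm is about {bridgman_gpa:.1f} GPa")
```

On these assumptions the room-temperature pressures reached by Bridgman are roughly four times the sustained synthesis pressure, which is consistent with attributing the earlier failure to the transformation rate rather than to insufficient pressure.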


Fig. 5 

Phase diagram of carbon 


Until further factual details become available it is permissible to make 
some small speculation concerning the growth of natural diamonds. 
First, the wide variation in crystal habit exhibited by natural diamonds 
does not by itself imply a wide variety of conditions for diamond growth, 
since the artificial specimens show octahedral, tetrahedral and dodecahe- 
dral habits. Nevertheless, the counting technique (Taylor 1956) shows 
that there is a wide variation both in overall defect density from specimen 
to specimen, and within one and the same specimen. The presence of 
triangular cavities or trigons on octahedral diamonds, and of the different 
types of etch figures (Tolansky 1955) according to the crystal habit, 
suggests certain features of the microcrystalline textures of different 
specimens. The suggestion that some specimens contain predominantly 
single vacant sites while in others they are aggregated into cavities in 
which the foreign atoms mostly reside, needs a satisfying explanation in 
terms of the conditions of crystal growth. The occurrence of mosaicity 
and birefringence implies that in many cases natural growth has occurred 
under conditions which have not been steady for more than short periods 
of growth, the conditions being such that the specimen has been subjected 
to shearing stresses and even highly non-equilibrium conditions. From 
counting and other techniques, taking the density of vacant sites of water- 
white specimens as 1 in 10 of normal lattice sites and supposing that this 
proportion is equal to exp (−W/kT) with W=10 ev, the temperature T 
at which the specimen must have grown to give the observed number of 
vacancies is somewhat above 2000°K. It is, of course, necessary that the 
vacant sites then present in equilibrium should have been frozen in at 
room temperature by a rapid cooling process. Many natural diamonds, 
therefore, bear evidence of having been formed, like the synthetic speci- 
mens, at temperatures between 2000-3000°K but to have subsequently 
been cooled by some catastrophic geophysical process. It is this process 
which must have given rise to many of the growth features of natural 
diamonds such as the mosaicity, the aggregate vacant site defects and the 
growth trigons. Although little more than a guess may be made at 
present as to an actual sequence of events, the following process appears 
consistent with many of the properties of the final product. There is 
almost no evidence that crystallization in diamond proceeds by spiral 
growth, although this is undoubtedly possible in silicon (Bond and 
Andrus 1956). At pressures of about 100 000 kg cm⁻² and temperatures 
in excess of 2000°K, diamond nucleation commences about foreign atom 
centres such as silicon. This begins as a semi-ionic linkage between silicon 
and carbon atoms since the range of the silicon-carbon forces exceeds 
that of the carbon-carbon forces. As further carbon atoms are attracted 
to the temporary silicon-carbon grouping, however, the latter decomposes 
because of the intenser carbon-carbon linkage at very close range. 
A diamond grouping of carbon atoms is thus formed, the silicon atom being 
expelled to the surface of the crystallite where the process is repeated 
indefinitely. The silicon atoms thus reside on the surface of the crystal- 
lite where they act as a catalyst, building continual fresh layers of diamond 
on the (111) face in octahedral specimens. Because of the failure to 
achieve long continued equilibrium conditions, pressure gradients are 
frequent and relative movement between minute crystallites forces them 
into atomic proximity while they are still in active growth. If the 
crystallites are still below a certain size, because of the great strength 
of the diamond linkage, the crystallites are pulled into exact parallelism 
and in the event of the perfect close packing of ideal octahedra of all 
the same size, would give rise to the production of a perfect aggregate 
crystal of class 2 in table 1, containing only single vacant sites. In 
fact, because of the improbability of exact packing and because the 
crystallites, although parallel, will show sideways displacements, cavities 
will be formed at edge and corner intersections, the walls of the cavities 
being true interior crystal faces characteristic of the crystal habit. The 
silicon and other foreign atoms will, of course, have migrated to these 
cavity surfaces as the surfaces of the crystallites coalesce and such foreign 
atoms therefore produce relatively little effect on the diamond lattice as 
postulated on p. 399 et seq. If the number of cavities is large the resulting 
specimen will be typical of class 5, and will thus have the characteristics 
of water-white type 1 diamonds. In addition to the cavity sites between 
the crystallites, each crystallite will contain its appropriate density of 
single vacant sites. 
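
The growth-temperature estimate made earlier in this paragraph rests on the equilibrium relation n/N = exp (−W/kT), which can be inverted to give T for a given vacancy fraction. The sketch below shows the inversion. The function names are illustrative, and the inputs (a vacancy fraction of 1 in 10⁵ and W = 2 ev) are assumptions chosen only to exercise the formula; they are not the figures quoted in the text.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def vacancy_fraction(formation_energy_ev: float, temperature_k: float) -> float:
    """Equilibrium fraction of vacant sites, n/N = exp(-W / kT)."""
    return math.exp(-formation_energy_ev / (BOLTZMANN_EV_PER_K * temperature_k))


def freeze_in_temperature(formation_energy_ev: float, fraction: float) -> float:
    """Temperature at which the equilibrium vacancy fraction equals `fraction`."""
    return formation_energy_ev / (BOLTZMANN_EV_PER_K * math.log(1.0 / fraction))


# Illustrative inputs only (assumptions, not values taken from the text).
assumed_w_ev = 2.0
assumed_fraction = 1e-5

temperature = freeze_in_temperature(assumed_w_ev, assumed_fraction)
print(f"equilibrium temperature: {temperature:.0f} K")
print(f"fraction at 3000 K: {vacancy_fraction(assumed_w_ev, 3000.0):.1e}")
```

Because the fraction rises steeply with temperature, a specimen frozen in from a higher temperature retains a larger density of single vacant sites, which is the basis of the distinction drawn between classes 2 and 3.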

On the other hand, if the number of foreign atoms present for nucleation 
is very small, fewer crystallites will be present and each crystallite will 
grow to a much larger size. When these aggregate under pressure 
gradients, the individual crystallites may be too large to be orientated 
exactly under atomic forces before crystallization has ceased. Conse- 
quently the aggregate forms a mosaic block, the foreign atoms congre- 
gating at the mosaic boundaries and at the cavity sites formed as in the 
previous case. Such diamonds would have the characteristics of the 
commonest type 2 specimens which occupy class 4 in table 1. The 
relatively large size of the crystallites and the consequent relative in- 
frequency of boundaries and cavities in specimens of comparable size, 
gives these diamonds their superior properties as counters. However, 
although the cavities are infrequent they would sometimes be expected 
to be of large size when the crystallite blocks are large. This difference 
between cavity sizes in types 1 and 2 as well as in cavity density would be 
expected to lead to differences in several physical properties. For 
example, the dispersal of the foreign atoms in many small cavities will 
lead to maximum electrical and optical disturbances extending over a 
solid angle of 4π, whereas with large cavities, the effective solid angle for 
the affected crystal may not exceed 2π. Moreover, the abundance of 
foreign atoms which can congregate on a larger surface will lead to 
aggregation of these foreign atoms, to form minute inclusions which, pro- 
vided they do not exceed the size of the cavity, will produce much less 
optical effect, such as 8 μ absorption, than if these atoms are dispersed. 
It may therefore be significant that Grenville-Wells and others report 
a much higher proportion of visible inclusions in type 2 specimens than 
in type 1. In the extremely rare event of perfect alignment of the large 
crystallites, the resulting highly perfect crystal will have many of the 
properties of the class 2 specimens. The failure to find the perfect class 1 
diamonds is, of course, explained as a thermodynamical consequence of 
the equilibrium production of single vacant sites, inevitably present at the 
high temperature necessary for diamond growth. There is, however, some 
variation possible in single vacant site density according to the tempera- 
ture at which the crystallites have grown and these variations give rise 
for example to the distinction between class 2 and 3 diamonds, the latter 
having been catastrophically cooled from a higher temperature than the 
former and thus containing a greater abundance of single vacant sites. 

The explanation of the growth of trigons on the natural faces of octahe- 
dral specimens requires considerably more detailed consideration of the 
last stages of the growth process. These trigons possess the curious feature 
that the apices of the equilateral triangles formed by the face of the trigon, 
point towards the edges of the octahedral face of the crystal. The sides 
of the trigon are therefore not parallel to the sides of the octahedral faces 
but orientated through 60° with respect to them. It is suggested that they 
represent the last stage of the growth process. After the aggregation of 
the crystallites, the cavities formed by the laterally displaced crystallites 
are of octahedral shape and parallel to the sides of the crystal. They will, 
however, contain the foreign atoms responsible for growth and imprisoned 
carbon atoms which have not yet crystallized. The impurity atoms will 
be largely at the edges of the crystallites since they have just completed 
the growth of the last perfect layer of carbon atoms on the crystallite 
face. Because of their location and because the atomic bombardment of 
the as yet uncrystallized carbon atoms will be more frequent in the 
corners of the cavities owing to the relative confinement of their geo- 
metrical motion, crystallization will proceed simultaneously at the edges of 
the cavity in an inwards direction as opposed to the normal outward 
growth of a crystal. The resulting cavity, as the original cavity fills 
with crystallizing material, will be orientated as are the trigons actually 
found, for the growth will occur in the directions of the lines joining the 
apices to the centroid of the original cavity. According to the nature of 
the pressure fluctuation which produced the aggregation of the crystallites, 
so the number of carbon atoms present may be deficient or in excess of 
those necessary just to fill the cavity with perfect diamond. If, together 
with the foreign atoms, the number of carbon atoms is insufficient, 
cavities will remain in the final specimen. If it is in excess, the final 
specimen will contain interstitial atoms or inclusions with the resultant 
properties discussed earlier in this article. It is clear that it is most 
improbable that the conditions for the diamond structure will persist 
until all the carbon atoms have crystallized but rather that the final stage 
will be a layer of graphite and compounds with the foreign atoms, on the 
surface of the cavity. These various possibilities are sufficient to account 
for the wide range of properties of natural diamonds. 

Approximate estimates have been made for the range of size of the 
cavities or aggregate defects in some diamonds both from electrical 
counting experiments, and from thermal conductivity data on the basis 
of phonon scattering. A wide range of values is indicated corresponding 
to defects varying from a few to a thousand atoms. 


§ 6. COMPARISON OF DIAMONDS WITH OTHER MATERIALS 


The comparison of the behaviour of defects in diamonds with those of 
defects in other crystalline materials which have been extensively studied 
needs great caution. If comparison is made with germanium and silicon, 
both of which are covalent crystals with the atoms arranged in the 
diamond structure, some difference arises from the relative simplicity of 
the electronic shells in diamond and the much greater value of its Debye 
temperature. In addition, since the energy gap between the valence 
band and the conduction band in germanium and silicon is sufficiently 
small to allow thermal excitation into the conduction band, these materials 
are semiconductors and not insulators at ordinary temperatures. Finally, 
while artificial defects produced by nuclear radiations in these materials 
produce the same general effects as in diamond (for example, a change in 
the electrical resistance), in germanium and silicon, unlike diamond, much 
annealing takes place as a result of the radiation and from thermal 
agitation even at room temperature. 
While the general behaviour of the diamond-silicon-germanium group is 
therefore similar, their performance differs in important details. 

On the other hand, if comparison is made with the vast body of know- 
ledge now available on the alkali halides (Seitz 1954), the points of 
difference in behaviour are at least as marked as the points of similarity. 
This is entirely to be expected since the alkali halides are formed through 
the action of the long range forces characteristic of ionic crystals, whereas 
in diamond we have the short range forces resulting from covalent bonds. 
Thus, characteristic infra-red absorption is expected and found in the 
alkali halides whereas it should be absent in diamond. Ionic mobility 
occurs in the alkali halides but has no counterpart in diamond. The 
coloration of the alkali halides by x-radiation is explained in terms of the 
well known F-centres, which are negative ion vacancies; there is no 
direct comparison with carbon atom vacancies in diamond which shows 
no appreciable coloration with x-rays and where the vacancy captures a 
positive hole more readily than an electron. Even as crystal counters, 
while diamond is characterized by a comparable mobility for electrons 
and positive holes, in the alkali halides the positive holes are almost 
immobile. As with germanium and silicon, the vacancy defects in the 
alkali halides anneal relatively easily whereas annealing in diamond is 
negligible. Nevertheless, certain comparisons of a general nature are 
reassuring as to the validity of the interpretation of the effects of defects 
in diamonds. For example, the concept of aggregate vacant sites or 
cavities has been found essential to explain phenomena in the alkali 
halides for which the single vacant site defect has proved insufficient. 
Irradiation of lithium fluoride (Binder and Sturm 1955) by neutrons gives 
an expansion of the lattice which requires most of the defects to be those 
associated with the production of both interstitial atoms as well as vacant 
sites, as required by Benny and Champion for irradiated diamonds. 
Non-uniform irradiation of the alkali halides (Primak et al. 1955) by 
X-rays or high energy deuterons gives rise to birefringence which can be 
directly associated with a non-uniform dilatation of the lattice, a condition 
which has been suggested as responsible for birefringence in diamond on 
p. 397. On the other hand, because of the exceptionally high thermal con- 
ductivity of diamond, the production of ‘thermal spikes’, postulated to 
explain radiation damage in many other materials, cannot occur (Primak 
1955). Direct comparison with the behaviour of indium antimonide, 
the structure of which would also reduce to the diamond lattice if only 
one type of atom were present (Cleland and Crawford 1954) is difficult 
for similar reasons to those given above for other materials. 


§ 7. CONCLUSIONS 


If the broad outlines of the qualitative behaviour of diamonds therefore 
appear to be adequately accounted for by defect theory, a rigorous 
quantitative comparison of theory and experiment has scarcely begun. 
Various experimental measurements have been made of the mobilities of 
electrons and holes in different diamonds and the experimental values lie 
between 1000 and 2000 cm/sec/volt/cm (Redfield 1954). This is an order of 
magnitude larger than the value calculated by Seitz but it is not yet certain 
whether the discrepancy lies in a wrong assumption as to the shape of the 
energy surfaces, an incorrect value of the deformation potential, or in 
the effective mass m* of the electrons being considerably less than the 
free electron mass. Champion and Dale (1956) have made fairly extensive 
measurements on the effect of light and temperature change on the elec- 
trical conductivity of diamonds and they have shown that the mobility of 
the charge carriers varies as $T^{-3/2}$ as required by existing theory. Although 
some other work exists (Seitz 1949, Dienes and Kleinman 1953, Fletcher 
and Brown 1953) detailed theoretical calculations are awaited on most of 
the optical and electrical phenomena for comparison with accurately 
measured experimental values of the corresponding quantities when these 
have been made. In this connection a start has been made on the study 
of the paramagnetic resonance produced in neutron-irradiated diamonds 
(Griffiths et al. 1954). The field of radiation damage is exceptionally 
promising, for the fact that the bombardment of water-white diamonds by 
high energy electrons produces a blue coloration, whereas neutron bom- 
bardment colours the specimens green shows that the energy levels 
created in the forbidden gap are capable of wide and controllable variation. 
The progressive decrease in resistance on irradiation (and the wide varia- 
tions in the magnitude of this property for different natural specimens) 
shows that diamond may be obtained in forms which show a continuous 
range of properties from the highest insulators, through semiconductors 
to conductors. Moreover, in contrast with germanium and the metals 
(Glen 1955), the defects which give rise to these properties in diamond are 
exceptionally stable with respect to temperature. The fact that the most 
perfect specimens of classes 1 and 2 which exhibit the properties of perfect 
covalent bonding and short range forces, can be degraded by one type of 
defect or another to produce islands of long range forces similar to those 
existing throughout ionic crystals, holds out great hope for a systematic 
and controlled study of the two types of force characteristic of the solid 
state. The study of diamond is therefore of intense interest to both 
experimental and theoretical physicists, to geophysicists because of the 
light thrown on the conditions of natural growth, to the electrical engineer 
because of the extraordinary range of its possible electrical properties 
and to the chemist because of its dependence on the behaviour of carbon- 
carbon bonds. It is, of course, of interest to the nuclear physicist as a 
detector or counter of radiations, while the permanence of the defects 
produced at high flux indicates an obvious possibility as an integrating 
flux recorder in the field of atomic energy. 
SUMMARY 


The work to date on the polaron ground state is collated and summarized. 

When the coupling is strong, simple considerations can be applied 
enabling one to relate the results of various authors. These considerations 
show that the rest energy is of the form $(-a\alpha^2 - b - c + O(\alpha^{-2}))\hbar\omega$, where 
$a$ is to be calculated using the Hartree approximation and is approximately 
equal to 0.1, $b$ is approximately equal to 2 and appears as a second order 
perturbation expression arising from the zero point vibrations, and $c$ is 
the energy of localization incurred through using a Hartree approxima- 
tion and is equal to 3. 


§ 1. INTRODUCTION 


THIS review is concerned with the properties of a slow electron moving in 
a conduction band of an ionic crystal. Such an electron is continually 
in interaction with its surroundings, which in good approximation can be 
described in terms of a continuous (macroscopic) polarization field. The 
electron moves in the potential produced by the polarization charges, 
while the polarization in turn is influenced by the Coulomb field of the 
electron. The polarization field can undergo harmonic oscillations at a 
certain frequency $\omega$, so that its energy can only be changed by multiples of 
$\hbar\omega$. For all values of $\omega$ a full quantal treatment of the problem is 
essential. We have here a quantum field theory in which a low energy 
electron should be pictured surrounded by its self-induced polarization 
field. This assembly of electron and field in equilibrium with each other 
is usually called a polaron. 

This is the simplest known physical example of a consistent non- 
relativistic theory of an interaction between a particle and a quantized 
field, and is interesting from a methodological point of view, quite apart 
from its physical implications. Here we shall be concerned mainly with 
the mathematical problem posed by the polaron, confining our attention 
to the wave function, energy, and effective mass. The mobility of a 
polaron in a thermally excited polarization field will not be discussed. 

An electron in an ionic crystal experiences quite a variety of electro- 
static forces. First in importance among these is the periodic electrostatic 
field which alone would act on the electron if the crystal lattice were 
rigid. This field is responsible for the appearance of energy bands. To 
a fair approximation the energy of the electron in an energy level in a 
given band is usually proportional, apart from a constant term, to the 
square of the wave number, so that the energy spectrum for the band can 
be simply described in terms of an effective electron mass (‘crystal mass’) 
different from the mass of a free electron. The remaining electrostatic 
forces are due to departures from perfect periodicity. Such departures 
can arise either through the presence of defects and impurities in the 
lattice, or from temporary displacements of the constituent particles of 
the lattice from their mean positions. Here we shall be concerned only 
with the latter possibility. The displacements may be induced thermally, 
or may be produced directly by the Coulomb field of the electron. If 
the displacements are uniform over several lattice cells the electric field 
accompanying them can be described in terms of a space-averaged electric 
polarization field P. Such a macroscopic description is adequate for the 
present problem, the reason being that the larger part of the interaction 
between the electron and the lattice polarization concerns lattice cells 
some distance from the electron. This simplifying feature is a conse- 
quence of the long range nature of the Coulomb interaction, combined 
with the quantal limitations which are set on the position of the electron 
by the uncertainty principle. 

Two main types of polarization can be distinguished. Firstly, there is 
the polarization produced in the electronic structure of the ions of the 
lattice. This we call the electronic polarization $P_e$. The oscillation 
frequencies concerned with $P_e$ lie in the ultra-violet and may be taken as 
infinite for the present problem. Secondly, the ions of the positive and 
negative lattices may oscillate against each other, producing what we call 
the optical ionic polarization $P_i$. These oscillations occur with frequen- 
cies which lie in the infra-red. To a good approximation the frequency $\omega$ 
of a longitudinal optical ionic oscillation in an optically isotropic crystal 
is independent of the wave number of the oscillation. In addition to 
electronic oscillations and optical ionic oscillations there are ionic oscilla- 
tions in which the positive and negative lattices vibrate with each other. 
These acoustical oscillations are unimportant for the present problem, 
because for obvious reasons the accompanying electric polarization is very 
small. 

As we have noted, the oscillation frequencies associated with the 
electronic polarization are very high, and therefore this polarization can at 
once be solved in terms of a suitable dielectric constant. We are thus 
left essentially with the problem of an electron interacting with a macro- 
scopic quantum field with a natural vibration frequency $\omega$. 

The properties of a polaron are determined by the magnitude of the 
oscillation frequency $\omega$. Here, to make the statement more precise, we 
may imagine $\omega$ to be varied by varying the masses of the ions of the 
lattice, keeping the other parameters of the system constant. 

If $\omega$ is very small the electron can follow the quantal zero point fluctua- 
tions of the polarization field adiabatically. The number of polarization 
quanta excited by the electron will then be very large, since the quantum 
of energy $\hbar\omega$ is very small. Under these circumstances the polarization 
field can to a good approximation be regarded as classical. If the 
electron is now confined within a volume of dimensions $l^3$ its kinetic 
energy will be of order $\hbar^2/(2ml^2)$, and the field will be that due to a charge 
$e$ spread out over $l^3$. Thus the potential energy will be of order $-e^2/(\epsilon^* l)$, 
where $\epsilon^*$ is a dielectric constant. Minimizing the sum of the above 
energies gives $l = \hbar^2\epsilon^*/(me^2)$ and a polaron energy of order $-e^4 m/(2\hbar^2\epsilon^{*2})$, 
apart from numerical factors. This static approximation can be improved 
by taking into account the zero point fluctuations of the field. This 
gives the low frequency adiabatic approximation which has been formu- 
lated in a general and rigorous manner by Bogoljubow (1949, 1950) and 
Tjablikow (1951, 1953, 1954). It has been found possible to study and 
understand the salient features of the low frequency adiabatic approxima- 
tion without the elaborate mathematical machinery heretofore used. The 
simplified approach which is adopted here is however probably limited in 
its scope to the calculation of energy eigenvalues. 
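
The static estimate above can be checked numerically. The following is a minimal sketch, not part of the original treatment: it works in units where $\hbar = m = e = \epsilon^* = 1$ (an assumption made purely for convenience), and the function names `static_energy` and `minimize_golden` are hypothetical. It minimizes the sum of the order-of-magnitude kinetic and potential energies and compares the result with the closed forms $l = \hbar^2\epsilon^*/(me^2)$ and $-e^4 m/(2\hbar^2\epsilon^{*2})$.

```python
import math


def static_energy(l, hbar=1.0, m=1.0, e=1.0, eps_star=1.0):
    """Order-of-magnitude energy of an electron confined to a region of size l:
    kinetic term hbar^2/(2 m l^2) plus potential term -e^2/(eps* l)."""
    return hbar**2 / (2.0 * m * l**2) - e**2 / (eps_star * l)


def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # Minimum lies in [a, d]: shrink from the right.
            b, d = d, c
            c = b - g * (b - a)
        else:
            # Minimum lies in [c, b]: shrink from the left.
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)


# Numerical minimum in the assumed units hbar = m = e = eps* = 1.
l_min = minimize_golden(static_energy, 0.1, 10.0)
e_min = static_energy(l_min)

# Closed-form values quoted in the text, in the same units.
l_closed = 1.0    # hbar^2 eps* / (m e^2)
e_closed = -0.5   # -e^4 m / (2 hbar^2 eps*^2)

print(l_min, e_min)
```

The numerical factors carry no more weight here than in the text; the sketch only confirms that the stated $l$ and energy follow from minimizing the two terms.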

The static and low frequency approximations fail when the period of 
the polarization oscillations becomes comparable to or less than the time 
taken by the electron to traverse the distance $l$, i.e. when 

$$\frac{1}{\omega} \lesssim \frac{ml^2}{\hbar} = \frac{\epsilon^{*2}\hbar^3}{me^4}.$$
(Let it be emphasized that these criteria are qualitative rather than 
quantitative.) For most ionic crystals a more exact statement of the 
above criterion rules out the low frequency approximations (see Fröhlich 
1954). If however an electron is bound in a lattice defect it is forced 
to move quickly and the low frequency approximations then become valid 
for real crystals (see Simpson 1955). 

The opposite extreme of a high oscillation frequency is mathematically 
well understood, though involving quantal features which make a pictorial 
description difficult. We note that when the quantum of energy $\hbar\omega$ 
is large there will not be many quanta in the polarization field, which will 
therefore exhibit large fluctuations. The field tends to follow the electron 
adiabatically since $\omega$ is large. It cannot do this perfectly however, for a 
complete correlation between electron and field would be compatible with 
the uncertainty principle only if the relative momentum between electron 
and field were very large. In this case there would be a rapid break-up 
to some configuration with less spatial definition. We expect therefore 
some critical length $d$ of quantal origin to play an important part in the 
high frequency approximation. The field at distances greater than $d$ 
from the electron will be essentially a classical Coulomb field, but at 
smaller distances the field will be more or less uniform apart from large 
fluctuations. The interaction energy will thus be of order $-e^2/(\epsilon^* d)$. 
The polarization quanta important in this system will have wavelengths 
greater than $d$. To determine $d$ we use the consideration that a lattice 
polarization wave cannot follow the electron if the distance travelled by 
the electron during the time of one oscillation is greater than the wave- 
length of the polarization. Since, by definition, the critical wavelength 
concerned is $d$, we must have $v/\omega \lesssim d$, where $v$ is the velocity of the electron. 
While the mean square value of $v$ may be small, $v$ will sometimes show 
fluctuations of order $\hbar/(md)$, arising from emission or absorption of a lattice 
quantum. When such a fluctuation in $v$ occurs the polarization field can 
re-adjust itself if $\hbar/(md\omega) = d$, i.e. if $d = \sqrt{\hbar/(m\omega)}$. Substitution of this 
value for $d$ gives a potential energy of order $-e^2/(\epsilon^* d) = -e^2\sqrt{m\omega/(\hbar\epsilon^{*2})}$. 
Assuming for the moment that the total energy is of the same order of 
magnitude we expect that between every time $1/\omega$ during which the 
electron’s velocity is $\hbar/(md)$ there will be a time $\hbar\{e^2\sqrt{m\omega/(\hbar\epsilon^{*2})}\}^{-1}$ during 
which the electron’s velocity is approximately zero. Thus the mean kinetic 
energy of the electron will be of order 

$$\frac{m}{2}\left(\frac{\hbar}{md}\right)^{2}\frac{1}{\omega}\,\frac{1}{\hbar}\,\frac{e^2}{\epsilon^*}\sqrt{\frac{m\omega}{\hbar}},$$

which is of the same order as the potential energy. Similar considera- 
tions apply to the kinetic energy of the polarization field. Thus the total 
energy in the high frequency case is of order $-e^2\sqrt{m\omega/(\hbar\epsilon^{*2})}$. 
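
The chain of estimates above (critical length, potential energy, fraction of time spent moving, mean kinetic energy) can be followed step by step in a short sketch. This is an illustration only, under the assumption of units with $\hbar = m = e = \epsilon^* = 1$; the function name `high_frequency_estimates` is hypothetical. It shows that the mean kinetic energy built from the two time scales comes out as one half of the magnitude of the potential energy, i.e. of the same order, and that the energy scales as $\sqrt{\omega}$.

```python
import math


def high_frequency_estimates(omega, hbar=1.0, m=1.0, e=1.0, eps_star=1.0):
    """Order-of-magnitude estimates for the high frequency case.

    Returns (d, e_pot, e_kin) where
      d     = sqrt(hbar / (m omega))      critical length
      e_pot = -e^2 / (eps* d)             interaction energy
      e_kin = (m/2)(hbar/(m d))^2 * f     mean kinetic energy, with f the
              fraction of time the electron moves with velocity hbar/(m d).
    """
    d = math.sqrt(hbar / (m * omega))
    e_pot = -e**2 / (eps_star * d)
    t_moving = 1.0 / omega          # time spent with velocity hbar/(m d)
    t_rest = hbar / abs(e_pot)      # time spent with velocity near zero
    # As in the text, the fraction is taken as t_moving / t_rest,
    # which presumes t_rest is much longer than t_moving.
    fraction = t_moving / t_rest
    e_kin = 0.5 * m * (hbar / (m * d)) ** 2 * fraction
    return d, e_pot, e_kin


d1, e_pot1, e_kin1 = high_frequency_estimates(1.0)
d4, e_pot4, e_kin4 = high_frequency_estimates(4.0)
print(d1, e_pot1, e_kin1)
print(d4, e_pot4, e_kin4)
```

Quadrupling `omega` halves `d` and doubles both energies, which is the $\sqrt{\omega}$ dependence stated in the text.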

The high frequency approximation can be developed in a form suitable 
for the treatment of most ionic crystals (see Lee et al. 1953). 

In between the extremes of high and low frequency is a region whose 
properties are as yet unknown. The transition between high and low 
frequency regions could conceivably be accompanied by some kind of 
resonance behaviour, the electron vibrating in sympathy with the lattice. 
Recently attempts have been made by Höhler (1954, 1955 a, 1955 b) and 
Feynman (1955) to bridge the gap between the low and high frequency 
approximations. The relationship between the methods of these authors 
and the low and high frequency approximations is discussed in the 
present review. These methods give a continuous transition between the 
low and high frequency regions. No resonance phenomena have been 
noticed as yet in the intermediate region, but in any case it is not clear 
whether this would be significant, for at present we have no criteria by 
which to judge the adequacy of any method for this region. In the two 
limits Feynman’s method gives, however, an excellent account of the 
actual state of affairs. 

Furthermore, Feynman’s calculations indicate that the anomalously 
large effective mass obtained in the low frequency limit must be corrected 
by higher order terms which become very important as $\omega$ is increased. 
This leads to the conclusion that the domain of validity of the low 
frequency adiabatic approximation is more restricted than would be 
indicated by considerations of polaron binding energy alone. 


§2. THE HAMILTONIAN 
2.1. Constitutional Equations for an Optically Isotropic Ionic Crystal 

The Hamiltonian, equations of motion, and quantization rules for the 
polaron problem have been formulated elsewhere (for example, see 
Fröhlich 1949, 1954, Born and Huang 1954). However, discussion of the 
physical basis of this theory should be helpful here, casting some light 
on the meaning of the approximations to be made later in solving the 
eigenvalue problem posed by the quantized Hamiltonian. 

As mentioned in §1, we are dealing with macroscopic electric and 
polarization fields $E$ and $P$. We have 

$$P = P_e + P_i, \tag{2.1}$$

where $P_e$ is due to deformation of electron clouds and $P_i$ to displacement 
of ions. 
Throughout we shall use unrationalized Gaussian units in formulating the 
problem, thus $D = E + 4\pi P$, $\operatorname{div} D = 4\pi\rho$, etc. 

In an optically isotropic medium the effective field producing the 
polarization may take the familiar form 


$$E_{\mathrm{eff}} = E + \tfrac{4\pi}{3}P = D - \tfrac{8\pi}{3}P, \tag{2.2}$$
and for our purposes the above equation will serve as an illustration, 
though the numerical factors should not be taken literally†. 

The equation of motion for $P_i$ will be of the form 

$$4\pi\ddot{P}_i + 4\pi\nu^2 P_i = \nu^2\beta_i E_{\mathrm{eff}}. \tag{2.3}$$

The equation for $P_e$ is simpler, since the corresponding natural vibration 
frequencies are so high (ultra-violet) that $P_e$ will follow $E_{\mathrm{eff}}$ adiabatically, 
thus 

$$4\pi P_e = \beta_e E_{\mathrm{eff}}. \tag{2.4}$$

The force term on the r.h.s. of the oscillator eqn. (2.3) includes contribu- 
tions from $P_i$, so that it is quite clear that the oscillation frequency is not 
just $\nu$. In fact the force $\nu^2\beta_i E_{\mathrm{eff}}$ takes account mainly of interactions 
with distant ions and with external fields, while the force $-4\pi\nu^2 P_i$ takes 
account mainly of interactions with nearest neighbours, due to inter- 
penetration of electron clouds. 

The microscopic parameters $\beta_i$, $\beta_e$ and $\nu$ can be expressed in terms of 
the static dielectric constant $\epsilon$, the high frequency dielectric constant 
$\epsilon_\infty$ (here are implied frequencies high compared to the infra-red absorption 
frequency but low compared to the ultra-violet absorption frequencies), 
and the oscillation frequency $\omega$ for longitudinal optical vibrations of the 
lattice, as we shall presently see. 

Inserting (2.2) into (2.4) gives 

$$4\pi P_e\left(1 + \tfrac{2}{3}\beta_e\right) = \beta_e D - \tfrac{2}{3}\beta_e\,4\pi P_i. \tag{2.5}$$
If the impressed frequency is high then $P_i$ will be unable to follow the 
oscillations and will be negligible, while $4\pi P_e$ will be equal to $D(1 - 1/\epsilon_\infty)$, 
hence 

$$1 - 1/\epsilon_\infty = \beta_e/\left(1 + \tfrac{2}{3}\beta_e\right). \tag{2.6}$$
Thus for all frequencies (below the ultra-violet) we have 
$$4\pi P_e = (1 - 1/\epsilon_\infty)D - \tfrac{2}{3}(1 - 1/\epsilon_\infty)\,4\pi P_i. \tag{2.7}$$

Following Fröhlich (1949, 1954) we may now define optical and infra-red 
polarizations as follows, the utility of this procedure being clear from 
(2.7), 

$$\left.\begin{aligned} 4\pi P_o &= (1 - 1/\epsilon_\infty)D, \\ 4\pi P_{ir} &= 4\pi P - (1 - 1/\epsilon_\infty)D. \end{aligned}\right\} \tag{2.8}$$

† (2.2) is valid for crystals in which the surroundings of each ion 
have tetrahedral symmetry. A discussion of the general case may be found 
in Born and Huang (1954). 
Quite evidently $P_{ir}$ obeys an equation of motion of the form (from (2.3), 
(2.2) and (2.8)), 
$$4\pi\ddot{P}_{ir} + 4\pi\omega^2 P_{ir} = \beta\omega^2 D. \tag{2.9}$$

The precise connection of $\beta$ and $\omega$ with $\beta_i$, $\beta_e$ and $\nu$ depends on the effective 
field $E_{\mathrm{eff}}$ producing ionic polarization. Usually this will not be repre- 
sented correctly by (2.2), but it is now clear that the form of eqn. (2.9) 
will not be affected, since $E_{\mathrm{eff}}$ is always a linear function of $D$, $P_e$ and 
$P_i$. Evidently (2.9) is a good starting point for a phenomenological 
approach. 

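If we consider a static field we have $4\pi P = (1 - 1/\epsilon)D$, $4\pi P_{ir} = \beta D$, 
$4\pi P_o = (1 - 1/\epsilon_\infty)D$, hence 

$$\beta = 1/\epsilon_\infty - 1/\epsilon. \tag{2.10}$$
The constitutional equations for the theory are thus 
$$\left.\begin{aligned} P &= P_o + P_{ir}, \\ \text{with}\quad 4\pi P_o &= (1 - 1/\epsilon_\infty)D, \\ \text{and}\quad 4\pi(\ddot{P}_{ir} + \omega^2 P_{ir}) &= \omega^2(1/\epsilon_\infty - 1/\epsilon)D. \end{aligned}\right\} \tag{2.11}$$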


The adjunction of (2.11) with Maxwell’s equations shows transverse 
plane wave vibrations ($\operatorname{div} P_{ir} = 0$) for all frequencies excepting those 
within a band of perfect reflection extending between $\omega\sqrt{\epsilon_\infty/\epsilon}$ and $\omega$, 
and also longitudinal vibrations ($\operatorname{curl} P_{ir} = 0$) at the fixed frequency $\omega$. 
The frequency $\omega$ can be determined optically by studying the trans- 
mission of infra-red rays through crystals. Born and Huang (1954) show 
theoretically that for sufficiently thin films the transmission is least at a 
frequency $\omega\sqrt{\epsilon_\infty/\epsilon}$. This affords an unambiguous determination of $\omega$. 
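
The relation between the thin-film transmission minimum and the longitudinal frequency can be written as a pair of small helper functions. This is a sketch for illustration; the names `reflection_band` and `longitudinal_from_thin_film` are hypothetical, and the numerical inputs in the usage lines are arbitrary round numbers, not measured values.

```python
import math


def reflection_band(omega, eps_static, eps_inf):
    """Return (lower, upper) edges of the band of perfect reflection,
    omega*sqrt(eps_inf/eps_static) and omega, where omega is the
    longitudinal optical frequency."""
    if not (eps_static > eps_inf > 0.0):
        raise ValueError("expected eps_static > eps_inf > 0")
    return omega * math.sqrt(eps_inf / eps_static), omega


def longitudinal_from_thin_film(omega_min, eps_static, eps_inf):
    """Invert the thin-film result: given the frequency of least
    transmission, omega*sqrt(eps_inf/eps_static), recover omega."""
    if not (eps_static > eps_inf > 0.0):
        raise ValueError("expected eps_static > eps_inf > 0")
    return omega_min * math.sqrt(eps_static / eps_inf)


# Arbitrary illustrative inputs (not data): eps = 4, eps_inf = 1, omega = 1.
lower, upper = reflection_band(1.0, 4.0, 1.0)
omega_back = longitudinal_from_thin_film(lower, 4.0, 1.0)
print(lower, upper, omega_back)
```

With these round inputs the lower band edge is half the longitudinal frequency, and inverting the thin-film relation returns the original value.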


2.2. The Total Energy 


The total energy of the system, conduction electrons + crystal, is given 
by standard electromagnetic theory as 

$$\sum \tfrac{1}{2}m\dot{r}_{el}^{2} + \frac{1}{4\pi}\iint (E\cdot\dot{D} + H\cdot\dot{B})\,d^3r\,dt. \tag{2.12}$$
The time integrations in (2.12) must of course be carried out along a path 
satisfying the equations of motion. By (2.11) we have 
$$E = D - 4\pi P_{ir} - 4\pi P_o = D/\epsilon_\infty - 4\pi P_{ir}. \tag{2.13}$$

Hence 

$$\frac{1}{4\pi}\iint E\cdot\dot{D}\,d^3r\,dt = \frac{1}{8\pi\epsilon_\infty}\int D^2\,d^3r - \iint P_{ir}\cdot\dot{D}\,d^3r\,dt. \tag{2.14}$$

The second term on the r.h.s. of (2.14) may be partially integrated with 
respect to time; this leaves $\iint \dot{P}_{ir}\cdot D\,d^3r\,dt$ to be evaluated. The integrand 
$\dot{P}_{ir}\cdot D$ can be expressed as a perfect differential by eliminating $D$ in 
favour of $P_{ir}$ and $\ddot{P}_{ir}$ using the equation of motion (2.11). Thus the 
total energy $\mathcal{H}$ can be written in the following form, 

$$\mathcal{H} = \sum \tfrac{1}{2}m\dot{r}_{el}^{2} + \frac{1}{8\pi\epsilon_\infty}\int D^2\,d^3r - \int P_{ir}\cdot D\,d^3r + \frac{2\pi\epsilon\epsilon_\infty}{\epsilon - \epsilon_\infty}\int\left\{(\dot{P}_{ir}/\omega)^2 + P_{ir}^2\right\}d^3r + \frac{1}{4\pi}\iint H\cdot\dot{B}\,d^3r\,dt. \tag{2.15}$$


2.3. Neglect of Magnetic Effects 


We now specialize to one slow electron without any free electromagnetic 
waves or superimposed external field. In this approximation we may 
neglect the magnetic terms in (2.15) and for $D$ use the approximate 
expression 

$$D = -e\,\nabla\frac{1}{|r - r_{el}|}. \tag{2.16}$$
The term in $D^2$ in (2.15) now gives only a constant self energy and may be 
omitted. The fact that this self energy differs by an infinite amount from 
the corresponding energy in vacuo is due to treating the lattice as a 
continuum. 

Consistent with the approximation we are making we may consider 
only a longitudinal field $P_{ir}$, thus 

$$4\pi P_{ir} = \nabla\Phi_{ir}. \tag{2.17}$$
For it is evident from (2.11) that in the approximation (2.16) the trans- 
verse polarization waves do not interact with the electron, directly or 
indirectly. 

The total energy now becomes, omitting the electron’s constant self 
energy, 

$$\mathcal{H} = \tfrac{1}{2}m\dot{r}_{el}^{2} + e\Phi_{ir}(r_{el}) + \frac{\epsilon\epsilon_\infty}{8\pi(\epsilon - \epsilon_\infty)}\int\left\{(\nabla\dot{\Phi}_{ir}/\omega)^2 + (\nabla\Phi_{ir})^2\right\}d^3r. \tag{2.18}$$
The equations of motion are 
$$m\ddot{r}_{el} = eE(r_{el}) = -e\nabla\Phi_{ir}(r_{el}), \tag{2.19}$$
$$\ddot{\Phi}_{ir} + \omega^2\Phi_{ir} = -e\omega^2\left(\frac{1}{\epsilon_\infty} - \frac{1}{\epsilon}\right)|r - r_{el}|^{-1}. \tag{2.20}$$


2.4. Quantization 


Our next task is to construct a quantal version of (2.18) to (2.20). We 
have to impose commutation relations between the variables $r_{el}$, $\dot{r}_{el}$, 
$\Phi_{ir}(r)$, and $\dot{\Phi}_{ir}(r')$, all these variables being taken at the same time. 
Let us suppose that the only non-zero commutators between the above 
variables are 

$$[r_{el}, m\dot{r}_{el}] = i\hbar,$$
$$[\Phi_{ir}(r), \dot{\Phi}_{ir}(r')] = i\hbar\omega^2\left(\frac{1}{\epsilon_\infty} - \frac{1}{\epsilon}\right)|r - r'|^{-1}. \tag{2.21}$$
It is easily shown that these commutation rules will be preserved during 
the motion (2.19) and (2.20), and that (2.19) and (2.20) follow in Hamil- 
tonian form with the total energy $\mathcal{H}$ (2.18) as Hamiltonian. (For 
example $i\hbar\dot{r}_{el} = [r_{el}, \mathcal{H}]$.) We must conclude that (2.21) are the correct 
commutation rules. They may of course be derived also by the standard 
methods of field theory (cf. Fröhlich 1954). 


2.5. Dimensionless Formulation of the Theory 

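Finally, for convenience in calculation, we may Fourier analyse $\Phi$
and use dimensionless variables. Thus, following Fröhlich (1954), we
introduce $\sqrt{\hbar/2m\omega}$ and $\hbar\omega$ as units of length and energy
respectively, and define

$$ \mathbf{x}_{el} = \mathbf{r}_{el}\sqrt{2m\omega/\hbar}, \ \text{etc.}, $$

then

$$ \mathbf{p}_{el} = -i\hbar\,\partial/\partial\mathbf{r}_{el} = -i\sqrt{2m\omega\hbar}\;\partial/\partial\mathbf{x}_{el}. \tag{2.22} $$

We express the Fourier analysis of $\Phi$ by the following expansions,

$$ \Phi(\mathbf{r}) = i\left[\frac{4\pi\omega(\epsilon-\epsilon_\infty)}{\epsilon\epsilon_\infty}\left(\frac{m\omega\hbar}{2}\right)^{1/2}\frac{1}{S}\right]^{1/2}
 \sum_{\mathbf{v}} \frac{1}{v}\left(b_{\mathbf{v}}^{\dagger}e^{-i\mathbf{v}\cdot\mathbf{x}} - b_{\mathbf{v}}e^{i\mathbf{v}\cdot\mathbf{x}}\right), $$

$$ \dot\Phi(\mathbf{r}) = -\omega\left[\frac{4\pi\omega(\epsilon-\epsilon_\infty)}{\epsilon\epsilon_\infty}\left(\frac{m\omega\hbar}{2}\right)^{1/2}\frac{1}{S}\right]^{1/2}
 \sum_{\mathbf{v}} \frac{1}{v}\left(b_{\mathbf{v}}^{\dagger}e^{-i\mathbf{v}\cdot\mathbf{x}} + b_{\mathbf{v}}e^{i\mathbf{v}\cdot\mathbf{x}}\right). \tag{2.23} $$

Here the summations are extended over wave vectors $\mathbf{v}$ such that the
functions $\exp(-i\mathbf{v}\cdot\mathbf{x})$ form a complete set in a large box
whose volume in the dimensionless variables is $S$. The wave number
$\mathbf{v} = 0$ is excluded, however, since it has no relevance to the dynamics
of the problem, contributing at most only to surface effects.

Comparison of (2.23) and (2.21) leads to the relations

$$ [b_{\mathbf{v}}, b_{\mathbf{w}}^{\dagger}] = \delta_{\mathbf{v}\mathbf{w}}, \qquad [b_{\mathbf{v}}, b_{\mathbf{w}}] = 0, \ \text{etc.} \tag{2.24} $$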


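The Hamiltonian, in units of $\hbar\omega$, may now be written in the following
dimensionless form (writing just $\mathbf{x}$ for $\mathbf{x}_{el}$),

$$ \mathcal{H}/\hbar\omega = -\frac{\partial^{2}}{\partial\mathbf{x}^{2}} + \sum_{\mathbf{v}}\left(b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}} + \tfrac{1}{2}\right)
 + i\left(\frac{4\pi\alpha}{S}\right)^{1/2}\sum_{\mathbf{v}}\frac{1}{v}\left(b_{\mathbf{v}}^{\dagger}e^{-i\mathbf{v}\cdot\mathbf{x}} - b_{\mathbf{v}}e^{i\mathbf{v}\cdot\mathbf{x}}\right), \tag{2.25} $$

where the dimensionless coupling constant $\alpha$ is given by

$$ \alpha = e^{2}\left(\frac{1}{\epsilon_\infty}-\frac{1}{\epsilon}\right)\left(\frac{m}{2\omega\hbar^{3}}\right)^{1/2}. \tag{2.26} $$

The zero point energy $\sum_{\mathbf{v}}\tfrac{1}{2}\hbar\omega$ will be omitted in
future, and we shall always understand $\nabla$ to mean
$\partial/\partial\mathbf{x} = \partial/\partial\mathbf{x}_{el}$.

Evidently we have above a set of quantized harmonic oscillators with
energy quanta $\hbar\omega$, in linear interaction with a particle.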
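As an illustration of the coupling constant defined in eqn. (2.26), $\alpha = e^{2}(1/\epsilon_\infty - 1/\epsilon)(m/2\omega\hbar^{3})^{1/2}$, the following is a minimal sketch of its evaluation in Gaussian units. The function name, the rounded constants and the sample material parameters are assumptions made here for illustration only; they are not taken from the paper.

```python
import math

# Physical constants in Gaussian (CGS) units, rounded; the e^2 convention
# of eqn. (2.26) assumes these units.
E_CHARGE = 4.80320e-10     # electron charge, esu
HBAR = 1.05457e-27         # reduced Planck constant, erg s
M_ELECTRON = 9.10938e-28   # free electron mass, g


def coupling_constant(eps_static, eps_inf, omega, mass=M_ELECTRON):
    """Dimensionless coupling constant alpha of eqn. (2.26).

    eps_static, eps_inf : static and high frequency dielectric constants
    omega               : lattice (angular) frequency in rad/s
    mass                : band mass of the electron in grams
    """
    if eps_static <= eps_inf:
        raise ValueError("expected eps_static > eps_inf for a polar crystal")
    dielectric_factor = 1.0 / eps_inf - 1.0 / eps_static
    return E_CHARGE**2 * dielectric_factor * math.sqrt(mass / (2.0 * omega * HBAR**3))


# Hypothetical input values, chosen only to exercise the formula.
alpha_example = coupling_constant(eps_static=5.6, eps_inf=2.25, omega=5.0e13)
```

Because $\alpha$ varies as $\omega^{-1/2}$, quadrupling the lattice frequency at fixed dielectric constants halves the coupling constant, which is the sense in which small $\alpha$ corresponds to high frequency.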


2.6. Conservation of Momentum 


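The total wave vector $\mathbf{M}$ commutes with the total energy $\mathcal{H}$ and is
therefore a constant of the motion:

$$ \mathbf{M} = -i\nabla + \sum_{\mathbf{v}} b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}}\,\mathbf{v}, \qquad i\hbar\, d\mathbf{M}/dt = [\mathbf{M}, \mathcal{H}] = 0. \tag{2.27} $$

Evidently we may seek wave functions which diagonalize both $\mathcal{H}$
and $\mathbf{M}$.

Since any momentum lost by the electron is gained by the crystal
it follows, by (2.27), that the momentum of the latter can differ from
$\sum_{\mathbf{v}}\sqrt{2m\omega\hbar}\; b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}}\,\mathbf{v}$ by at most a
time independent constant. Thus it is quite consistent to regard
$\sum_{\mathbf{v}}\sqrt{2m\omega\hbar}\; b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}}\,\mathbf{v}$ as a momentum
carried by the polarization waves.†

In considering eigenstates $|\Psi_{\mathbf{k}}\rangle$ of the operator $\mathbf{M}$ with
eigenvalue $\mathbf{k}$, one may utilize the equation

$$ -i\nabla\,|\Psi_{\mathbf{k}}\rangle = \Big(\mathbf{k} - \sum_{\mathbf{v}} b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}}\,\mathbf{v}\Big)|\Psi_{\mathbf{k}}\rangle \tag{2.28} $$

to eliminate the operator $\nabla$. Then, defining new canonical variables

$$ b_{\mathbf{v}}'^{\dagger} = b_{\mathbf{v}}^{\dagger}\exp(-i\mathbf{v}\cdot\mathbf{x}), \ \text{etc.}, \tag{2.29} $$

one obtains for these states the effective Hamiltonian $\mathcal{H}(\mathbf{k})$ given
below‡:

$$ \mathcal{H}(\mathbf{k})/\hbar\omega = k^{2} + \sum_{\mathbf{v}} b_{\mathbf{v}}'^{\dagger}b_{\mathbf{v}}'\,(1 - 2\mathbf{v}\cdot\mathbf{k} + v^{2})
 + i\left(\frac{4\pi\alpha}{S}\right)^{1/2}\sum_{\mathbf{v}}\frac{1}{v}\left(b_{\mathbf{v}}'^{\dagger} - b_{\mathbf{v}}'\right)
 + \sum_{\mathbf{v},\mathbf{w}} b_{\mathbf{v}}'^{\dagger}b_{\mathbf{w}}'^{\dagger}b_{\mathbf{w}}'b_{\mathbf{v}}'\;\mathbf{v}\cdot\mathbf{w}. \tag{2.30} $$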


† It must be noted, however, that the present idealized model takes no account
of density changes and of the fact that the restoring forces really depend only
on interparticle distances, not on displacements from a mean position. These
neglected features would have to be included to give a description of the
physical mechanism of momentum transfer. On the present model the only way
a lattice oscillation can carry momentum is to have associated a bodily movement
of the whole crystal (cf. Fröhlich 1954). This unrealistic feature could be
remedied by an exact treatment taking account of the above mentioned effects.

‡ The passage from eqn. (2.25) to (2.30) may be regarded as a canonical
transformation (Zienau 1953).

The effective mass $m^{*}$ (in ordinary units) is defined (using Fröhlich's
dimensionless units) by the equation

$$ E(\mathbf{k}) = E(0) + (m/m^{*})\,k^{2}\hbar\omega + O(k^{4}), \tag{2.31} $$

where $E$ denotes the eigenvalue of $\mathcal{H}$.


§ 3. THE DYNAMIC OR HIGH FREQUENCY APPROXIMATION† - SMALL $\alpha$
3.1. Displaced Oscillator Wave Functions


If $\alpha$ (eqn. (2.26)) is small, then only a few virtual quanta will be excited
and the recoil velocity of the electron in the polaron will not be large,
except for occasional fluctuations.

On the other hand, according to eqn. (2.26) a small value of $\alpha$
corresponds, other things being constant, to a high lattice frequency $\omega$.
Thus most of the oscillators will follow the electron adiabatically; only those
with very short wavelengths will be unable to do this. Therefore the
electron will be closely accompanied on its peregrinations through the
lattice by the classical polarization field of a point charge, excepting
at short distances where the polarization will be less than the classical
polarization. The critical length concerned was seen in § 1 to be of
order $\sqrt{\hbar/m\omega}$, which does indeed grow progressively smaller as
$\omega$ is increased, thus confirming these considerations. Fröhlich (1954) has
referred to this state of affairs as the dynamic case, since the response of
the polarization field at short distances is governed by its dynamical
properties.

It is interesting to compare the polaron problem with the case of an
electron in vacuo interacting with the electromagnetic field. The critical
length $k_c^{-1}$ follows from setting $k_c^{-1}$ equal to
$\sqrt{\hbar/m\omega(k_c)}$. Now the dispersion law in vacuo is $\omega(k) = ck$,
where $c$ is the velocity of light, so that $k_c^{-2} = \hbar/(mck_c)$ and
the critical length $k_c^{-1}$ is just the Compton wavelength $\hbar/mc$. This
length is so small (1/137 of the Bohr radius) that the interactions between
different electrons can usually be adequately treated by the static Coulomb
field approximation. But if we wished to calculate the self energy of an
electron, then we should be concerned primarily with the state of affairs
at distances less than the critical length. Dynamical effects would then
become important.

Now the polaron problem essentially concerns the self energy of an
electron in a field. Thus it can be understood that dynamical effects
are quite important even though $\omega$ be so large that a static approximation
is adequate for treating the interaction between different polarons.
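The statement that the Compton wavelength $\hbar/mc$ is 1/137 of the Bohr radius $\hbar^{2}/me^{2}$ can be checked in a few lines, since the ratio of the two lengths is $e^{2}/\hbar c$. The sketch below assumes Gaussian units and rounded constants; the variable names are chosen here and are not part of the paper.

```python
# Rounded constants in Gaussian (CGS) units.
E_CHARGE = 4.80320e-10     # esu
HBAR = 1.05457e-27         # erg s
M_ELECTRON = 9.10938e-28   # g
C_LIGHT = 2.99792e10       # cm / s

compton_length = HBAR / (M_ELECTRON * C_LIGHT)      # critical length in vacuo
bohr_radius = HBAR**2 / (M_ELECTRON * E_CHARGE**2)  # static Coulomb length scale

# The ratio equals e^2 / (hbar c), independent of the electron mass.
length_ratio = compton_length / bohr_radius
```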


† The dynamic method is often referred to as the intermediate coupling
method because of the close connection with Tomonaga's (1947) well known
intermediate coupling approximation. Here, however, it is better to reserve
the terminology 'intermediate coupling' for a range of coupling constants in
which neither of the two approximations of §§ 3, 4 and 5 is adequate.

As we have just noted, the main part of the polarization field in the
high frequency case follows the electron adiabatically. Now it is well
known (Tomonaga 1947, Glauber 1951, etc.) that quasi-static classical
fields should be represented in quantum mechanics by the ground states
of displaced field oscillators.†
Here the displacements of the oscillators must be phased to give a
polarization state centred on the electron. Thus the correct procedure in this
approximation is to minimize the expectation value of the effective
Hamiltonian (eqn. (2.30)) by using displaced oscillator wave functions
for the oscillator variables $b_{\mathbf{v}}'$. This was done by Gurari (1953) and
independently by Lee and Pines (1952), and Tjablikow (1952, 1953 b),
with conspicuous success.


3.2. Quantitative Features of the Dynamic Case 


To see the essential features of the dynamic method it will be sufficient
to consider the case where the total wave number $\mathbf{k}$ is zero. Neglecting
for the moment the last term in the effective Hamiltonian $\mathcal{H}(0)$, eqn.
(2.30), the remainder of $\mathcal{H}(0)$ represents a set of displaced oscillators
with displacements $d_{\mathbf{v}}$ where†

$$ d_{\mathbf{v}} = -i\left(\frac{4\pi\alpha}{S}\right)^{1/2}\frac{1}{v(1+v^{2})}, \tag{3.1} $$

and the lowest energy of these oscillators is†

$$ -\sum_{\mathbf{v}} (1+v^{2})\, d_{\mathbf{v}}^{\dagger}d_{\mathbf{v}}\,\hbar\omega = -\alpha\hbar\omega. \tag{3.2} $$
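The value $-\alpha\hbar\omega$ in eqn. (3.2) can be checked numerically. With $|d_{\mathbf{v}}|^{2} = (4\pi\alpha/S)\,v^{-2}(1+v^{2})^{-2}$ and the usual continuum replacement $\sum_{\mathbf{v}} \to S(2\pi)^{-3}\int d^{3}v$, the sum becomes $(2\alpha/\pi)\int_{0}^{\infty} dv/(1+v^{2})$. The sketch below assumes that replacement; the function names and the quadrature scheme are choices made here, not part of the paper.

```python
import math


def radial_integral(f, n=20000):
    """Midpoint-rule estimate of the integral of f(v) over 0 < v < infinity,
    using the substitution v = tan(theta) with 0 < theta < pi/2."""
    h = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        v = math.tan((j + 0.5) * h)
        total += f(v) * (1.0 + v * v) * h   # dv = (1 + v^2) d(theta)
    return total


def rest_energy_dynamic(alpha):
    """-sum_v (1 + v^2) |d_v|^2 in units of hbar*omega, in the continuum limit."""
    return -(2.0 * alpha / math.pi) * radial_integral(lambda v: 1.0 / (1.0 + v * v))


energy = rest_energy_dynamic(3.0)   # compare with -alpha = -3
```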


The dynamic effects are seen in the factor $(1+v^{2})^{-1}$ in eqn. (3.1).

Consideration of spherical symmetry shows that the last term in
$\mathcal{H}(0)$ takes on zero expectation value when the wave function required to
give eqn. (3.2) is used. Thus eqn. (3.2) sets a variational upper bound to
the energy for $\mathbf{k} = 0$. To obtain the best expectation value of
$\mathcal{H}(\mathbf{k})$ for arbitrary $\mathbf{k}$ it is necessary to set up a general
displaced oscillator wave function (Lee and Pines 1952, Gurari 1953,
Fröhlich 1954), thus†

$$ |\Psi_{\mathbf{k}}\rangle = \exp(i\mathbf{k}\cdot\mathbf{x})\,\exp\Big\{\sum_{\mathbf{v}} d_{\mathbf{v}}(\mathbf{k})\, b_{\mathbf{v}}'^{\dagger}\Big\}\,|0\rangle\,\exp\Big\{-\tfrac{1}{2}\sum_{\mathbf{v}} |d_{\mathbf{v}}(\mathbf{k})|^{2}\Big\}, \tag{3.3} $$

and then to adjust the $d_{\mathbf{v}}(\mathbf{k})$. This gives, for small $k$,

$$ E(\mathbf{k}) \leq \{-\alpha + k^{2}(1+\alpha/6)^{-1} + O(k^{4})\}\,\hbar\omega, \tag{3.4} $$

so that the effective mass is, according to eqn. (2.31),

$$ m^{*} = m(1+\alpha/6). \tag{3.5} $$

Equation (3.4) extends the result of perturbation theory for small $k$ and
small $\alpha$ to larger values of $\alpha$.†

† The Hamiltonian for an oscillator with a complex displacement $d$ is of the
form $b^{\dagger}b - d^{\dagger}b - b^{\dagger}d = (b^{\dagger}-d^{\dagger})(b-d) - d^{\dagger}d$.
Since $b^{\dagger}-d^{\dagger}$ and $b-d$ are canonical variables having all the
defining properties (commutation rules and reality conditions) obeyed by
$b^{\dagger}$ and $b$, it follows from the usual theory of the harmonic oscillator
that the eigenvalues of the displaced Hamiltonian are $-d^{\dagger}d + n$, where
$n = 0, 1, 2, 3, \ldots$ The normalized lowest eigenstate (eigenvalue
$-d^{\dagger}d$) is $\exp(db^{\dagger})\,|0\rangle\,\exp(-\tfrac{1}{2}d^{\dagger}d)$, where
$|0\rangle$ is the normalized lowest eigenstate for the undisplaced oscillator,
i.e. $b\,|0\rangle = 0$. The mean number of quanta in the displaced lowest
eigenstate is $d^{\dagger}d$, and $b$ and $b^{\dagger}$ take the expectation values
$d$ and $d^{\dagger}$ respectively.
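The spectrum quoted in the footnote on displaced oscillators, eigenvalues $-|d|^{2} + n$ for the Hamiltonian $b^{\dagger}b - d^{*}b - d\,b^{\dagger}$, can be illustrated numerically. The following is a minimal sketch using NumPy in a truncated Fock space; the truncation size, the function name and the sample displacement are assumptions chosen here for illustration.

```python
import numpy as np


def displaced_oscillator_spectrum(d, n_max=80):
    """Eigenvalues of b^dagger b - conj(d) b - d b^dagger in a Fock space
    truncated to the states with 0 .. n_max-1 quanta."""
    n = np.arange(n_max)
    # Annihilation operator: b|n> = sqrt(n)|n-1>, so entries sit on the superdiagonal.
    b = np.diag(np.sqrt(n[1:]), k=1).astype(complex)
    b_dag = b.conj().T
    hamiltonian = b_dag @ b - np.conj(d) * b - d * b_dag
    return np.linalg.eigvalsh(hamiltonian)   # ascending order


d = 0.8 - 0.6j                               # arbitrary complex displacement
levels = displaced_oscillator_spectrum(d)    # lowest levels near -|d|^2, 1-|d|^2, ...
```

The truncation is harmless here because the displaced ground state has a mean occupation $|d|^{2}$ of order unity, so the weight on states near the cut-off is negligible; a larger displacement would call for a larger `n_max`.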


3.3. The Range of Validity of the High Frequency Approximation 

The validity of the dynamic method may be assessed (as Gurari
intended) by treating the remaining non-diagonal part of the Hamiltonian
as a small perturbation (Lee et al. 1953, Zienau 1953, Haga 1954). Such
treatments show that the solution is indeed a good approximation to the
true state of affairs for $\alpha < 6$ (see tables 1 and 2)‡. The criterion used is
that the corrections found by this perturbation calculation shall be small.

For further details of the various calculations discussed or mentioned
in this section the reader may be referred to the review article by
Fröhlich (1954).

We should mention at this point that Low and Pines (1955) have
recently recalculated the low temperature mobility in the high frequency
approximation.


3.4. Breakdown of the High Frequency Approximation—Correlations 


The following points are important in considering the validity of the 
dynamic method. 

(i) An essential feature of the displaced oscillator wave function (3.3)
is that it allows for many quanta to be present; the necessity of this
followed from the variational method used by Fröhlich et al. (1950).

(ii) The dynamic method reproduces the results of perturbation theory
(to order $\alpha$) in the region where perturbation theory is applicable
($\alpha < 1$). This can be understood either from the high frequency
considerations advanced in § 3.1, or more simply from the fact that the only
term in $\mathcal{H}(\mathbf{k})$ which makes the dynamic method not exact for
$\mathbf{k} = 0$ is the last term in eqn. (2.30), and to bring this term into play
the coupling must be sufficiently strong to bring at least two virtual quanta
into the field.

(iii) The recoil energy of the electron appears in eqn. (2.30) as a sum of
two terms, thus

$$ \hbar\omega\sum_{\mathbf{v}} b_{\mathbf{v}}'^{\dagger}b_{\mathbf{v}}'\,v^{2}
 + \hbar\omega\sum_{\mathbf{v},\mathbf{w}} b_{\mathbf{v}}'^{\dagger}b_{\mathbf{w}}'^{\dagger}b_{\mathbf{w}}'b_{\mathbf{v}}'\;\mathbf{v}\cdot\mathbf{w}. \tag{3.6} $$

† The method breaks down when $E(\mathbf{k})$ is large enough to permit emission of
a real quantum, also for $\alpha > 6$.

‡ It so happens that the method of Lee et al. also gives exactly the same
results for the energy (but not of course for the wave function) as would be
obtained by perturbation theory to order $\alpha^{2}$. Therefore not only is it good
for $\alpha < 6$, it is also exact to order $\alpha^{2}$, in the sense of conventional
perturbation theory.


Polaron Rest Energy and Effective Mass 425 


The second of the above terms involves quanta correlated through the
factor $\mathbf{v}\cdot\mathbf{w}$, and in future this term will be called the
correlation recoil energy. As noted in § 3.2 above, this energy is non-diagonal
in the displaced oscillator solution for $\mathbf{k} = 0$, since the latter does not
allow for any correlation between the various quanta in the field. For
sufficiently large $\alpha$ the considerations of § 3.1 on the adiabatic nature of
the self-field break down; it is now evident that we may ascribe this to the
increasing importance of the correlation recoil energy term. Thus for
large $\alpha$ it becomes desirable to use wave functions designed to be large
only for configurations in which the correlation part of the recoil energy
becomes negative.

If $\alpha \leq 6$, the non-diagonal part of $\mathcal{H}(\mathbf{k})$, referred to a complete
set of displaced oscillator wave functions, may be treated as a small
perturbation (Lee et al. 1953, Zienau 1953). According to tables 1 and 2 this
gives for $\alpha = 5$ a correction to the rest energy of about 7% and a correction
to the effective mass of about 27%.† This is quite satisfactory so far as the
rest energy is concerned but casts some doubt on the validity of the
calculation of the effective mass for $\alpha > 6$ or thereabouts. Similar
conclusions may be drawn from a slightly different set of calculations by
Haga (1954).‡

Lee and Pines (1953) have treated the recoil variationally by introducing
quanta in P as well as S states, but this approach is not as effective as one
could have hoped. For $\alpha = 5$ the results differ from those of Lee et al.
by only 5%, though of course they have the extra validity of a variational
approach. For $\alpha > 10$ the method is inferior to the low frequency static
approximation to be described in the next sections.‖

From the preceding remarks it is clear that the Lee et al. (1953)
perturbation theory extension of the high frequency method of Gurari and Lee
and Pines is quite adequate for $\alpha < 6$. This condition is fulfilled for
most substances (Fröhlich 1954). The known exceptions lie closer to the
high than to the low frequency region.


† The author is indebted to Dr. G. Höhler for bringing to his notice the
omission of a factor 2 in eqns. (4.2) and (4.3) of Lee et al. (1953). All results
quoted in the present paper have been corrected for this error.

‡ The calculations of Lee et al., and Zienau, are based on a set of displaced
oscillator functions chosen so that the lowest displaced oscillator state minimizes
the expectation value of $\mathcal{H}(\mathbf{k})$. Haga, on the other hand, always takes the
displaced oscillator states appropriate to the Hamiltonian $\mathcal{H}(0)$. This
facilitates the calculations and is altogether adequate for the small values of
$k$ considered. However, Haga's division of the Hamiltonian $\mathcal{H}(\mathbf{k})$ into
perturbed and unperturbed parts differs from that adopted in the other works
mentioned, and seems less satisfactory since the perturbation term taken is
partly diagonal in the representation by displaced oscillator functions. Haga's
method, unlike that of Lee et al., and Zienau, does not reduce to perturbation
theory to order $\alpha^{2}$ when $\alpha$ is small.

‖ For $\alpha = 15$ and no cut-off in the wave number, Lee and Pines obtain
$-17.56\,\hbar\omega$ for the energy, whereas the low frequency static method gives
$-23.85\,\hbar\omega$.

§ 4. THE LANDAU-PEKAR STATIC APPROXIMATION - THE LOW
FREQUENCY ADIABATIC LIMIT - LARGE $\alpha$


4.1. The Adiabatic Principle


The case of large $\alpha$ corresponds, other things being constant, to a small
lattice frequency $\omega$ (see eqn. (2.26)). This suggests that the electron is able
very quickly to settle down into equilibrium with the polarization field
as it appears at a given instant. That is, the electron can follow the
polarization changes adiabatically†, occupying always the lowest energy
state in the polarization potential.


4.2. Neglect of Zero-Point Fluctuations 


If one now neglects zero-point fluctuations of the lattice field in which
the electron moves adiabatically, then the electron will always remain in
the same state, i.e. a Hartree wave function will be indicated. One would
suppose that for sufficiently strong coupling the zero-point fluctuations
would in any case be unimportant. This is the justification for the use of
the Hartree method in the approximate treatment of the adiabatic
problem posed by the Hamiltonian for large $\alpha$. The neglect of zero-point
fluctuations ought to be equivalent to treating the lattice polarization
classically (Pekar‡ 1954, Fröhlich‡ 1954). The dynamical properties of
the lattice are not involved in this approximation, which may therefore
be referred to as the static case (Fröhlich 1954).

While in fact the classical treatment of the lattice polarization leads to
correct results, it is desirable to treat the problem in a quantum mechanical
manner, so as to put the calculations on a firmer foundation, for one
can never be really justified in treating the lattice as a classical system.
Pekar (1954) and Fröhlich (1954) have already given a quantal calculation
of the rest energy of a polaron. In the following section we extend their
calculations to include a quantal treatment of the effective mass in Hartree
approximation.

The Hartree wave function is of the form

$$ |\Psi_{\mathbf{y}}\rangle = \Omega(\mathbf{x}-\mathbf{y})\,\Phi[b_{\mathbf{v}}^{\dagger}\exp(-i\mathbf{v}\cdot\mathbf{y})]\,|0\rangle, \tag{4.1} $$

with

$$ \int |\Omega(\mathbf{x}-\mathbf{y})|^{2}\, d^{3}x = 1, \qquad \langle 0|\Phi^{\dagger}\Phi|0\rangle = 1. \tag{4.2} $$

The expression (4.1) is to represent, approximately, a polaron in the
neighbourhood of the space point $\mathbf{y}$. The factor $\Omega(\mathbf{x}-\mathbf{y})$ is the
electron's wave function, while the factor
$\Phi[b_{\mathbf{v}}^{\dagger}\exp(-i\mathbf{v}\cdot\mathbf{y})]\,|0\rangle$ represents a
lattice state in which the polarization is centred on the point $\mathbf{y}$. The
localization of the polaron about $\mathbf{y}$ has no physical significance, but is
brought in simply as a necessary consequence of using the Hartree
approximation.

† For small $\alpha$ on the other hand the polarization tries to follow the electron
adiabatically.

‡ Earlier references are given here.


4.3. A Variational Principle for the Effective Mass 


Clearly one cannot diagonalize the total wave number $\mathbf{M}$ (eqn. (2.27))
while using a localized wave function of the form (4.1). Thus the usual
way of formulating the effective mass problem (see eqn. (2.31)) is no longer
appropriate, for the usual method puts the emphasis on the kinetic
energy as a function of the wave number. To treat the problem in Hartree
approximation we must instead consider a moving 'wave packet'
approximately satisfying the time dependent Schroedinger equation,
so considering the kinetic energy as a function of the velocity of the wave
packet without diagonalizing the wave number; thus one has simply
to consider a normalized time dependent wave function designed to
describe a state of affairs such as represented by eqn. (4.1), moving with
constant velocity $\mathbf{V}$.

That is, one considers a time dependent wave function of the form

$$ |\chi_{\mathbf{V}}(t)\rangle = \exp(-i\mathbf{M}\cdot\mathbf{V}\omega t)\,|\Psi_{\mathbf{V}}\rangle\,\exp(-iAt/\hbar) = |\Psi_{\mathbf{V}\omega t}\rangle\,\exp(-iAt/\hbar), \tag{4.3} $$

where, in applying the method to the polaron problem, $|\Psi_{\mathbf{V}}\rangle$ will be
taken as defined by eqns. (4.1) and (4.2), with both $\Omega$ and $\Phi$ depending
on $\mathbf{V}$. For later convenience the number $\mathbf{V}$ appearing in (4.3) has been
taken dimensionless, thus $\mathbf{V} = d\mathbf{x}/d(\omega t)$. Using (2.22), it is readily seen
that the velocity in ordinary units will be $\mathbf{V}\sqrt{\hbar\omega/2m}$.

Clearly $|\chi_{\mathbf{V}}(t)\rangle$ should be chosen to be an approximate solution of the
time dependent Schroedinger equation. There is no clear-cut criterion
by which one may judge the goodness of an approximate solution of the
time dependent Schroedinger equation. One criterion which springs
most readily to mind is that the following positive definite expression be
made as small as possible:

$$ \langle\chi_{\mathbf{V}}(t)|\,\{-i\hbar\,\overleftarrow{\partial}/\partial t - \mathcal{H}\}\{i\hbar\,\overrightarrow{\partial}/\partial t - \mathcal{H}\}\,|\chi_{\mathbf{V}}(t)\rangle. \tag{4.4} $$

Substituting eqn. (4.3) into (4.4) and minimizing first with respect to $A$
shows that we should choose

$$ A = \langle\Psi_{\mathbf{V}}|\,\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega\,|\Psi_{\mathbf{V}}\rangle. \tag{4.5} $$

With the above value for $A$ the energy is given without ambiguity as the
expectation value of the Hamiltonian,

$$ E(\mathbf{V}) = \langle\chi_{\mathbf{V}}(t)|\,i\hbar\,\partial/\partial t\,|\chi_{\mathbf{V}}(t)\rangle
 = \langle\chi_{\mathbf{V}}(t)|\,\mathcal{H}\,|\chi_{\mathbf{V}}(t)\rangle
 = \langle\Psi_{\mathbf{V}}|\,\mathcal{H}\,|\Psi_{\mathbf{V}}\rangle, \tag{4.6} $$

while the expression (4.4), which is to be minimized, reduces to

$$ \langle\Psi_{\mathbf{V}}|\,(\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega)^{2}\,|\Psi_{\mathbf{V}}\rangle
 - \big(\langle\Psi_{\mathbf{V}}|\,\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega\,|\Psi_{\mathbf{V}}\rangle\big)^{2}, \tag{4.7} $$

with, of course, $\langle\Psi_{\mathbf{V}}|\Psi_{\mathbf{V}}\rangle = 1$.

The variational principle (4.7) can be applied to any state with velocity
$\mathbf{V}$. We therefore need another criterion to ensure that we consider the
correct state. Now all exact solutions of the principle (4.7) are eigenstates
of $\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega$. This suggests that we try to obtain the lowest
eigenstate of the operator $\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega$, for this lowest
eigenstate will satisfy the principle (4.7) exactly, and will also have the
required correspondence with the lowest eigenstate of $\mathcal{H}$. Any approximate
solution for the lowest eigenstate of $\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega$ will be also
an approximate solution, in some rather ill-defined way, of the principle (4.7).

We therefore take as the final variational principle for the effective mass
the minimization of the following expression $J_{\mathbf{V}}$:

$$ J_{\mathbf{V}} = \langle\Psi_{\mathbf{V}}|\,\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega\,|\Psi_{\mathbf{V}}\rangle, \tag{4.8} $$

with $\langle\Psi_{\mathbf{V}}|\Psi_{\mathbf{V}}\rangle = 1$.
The corresponding energy is

$$ E(\mathbf{V}) = \langle\Psi_{\mathbf{V}}|\,\mathcal{H}\,|\Psi_{\mathbf{V}}\rangle. \tag{4.8 a} $$

The variational principle (4.8) with (4.8 a) is apparently the most natural
generalization of the usual variational principle to a moving system, for
the case that the trial function does not diagonalize the total wave number.
The principle has been applied in such circumstances in metal theory by
Fröhlich (1953).

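There are two alternative definitions for the effective mass; we call
these $m_a^{*}$ and $m_b^{*}$:

$$ \tfrac{1}{4}(m_a^{*}/m)\,V^{2}\hbar\omega = \langle\Psi(\mathbf{V})|\,\mathcal{H}\,|\Psi(\mathbf{V})\rangle - \langle\Psi(0)|\,\mathcal{H}\,|\Psi(0)\rangle, \tag{4.9 a} $$

$$ -\tfrac{1}{4}(m_b^{*}/m)\,V^{2}\hbar\omega = J_{\mathbf{V}} - J_{0}. \tag{4.9 b} $$

The factors on the left-hand sides of eqns. (4.9) appear because, as noted
earlier, the velocity is $\mathbf{V}\sqrt{\hbar\omega/2m}$. The usual expression,
$\tfrac{1}{2}$(effective mass) × (velocity squared), becomes
$\tfrac{1}{4}(m^{*}/m)V^{2}\hbar\omega$.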
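A simple consistency check on the two definitions is the free electron ($\alpha = 0$), for which both should return $m^{*} = m$. The sketch below assumes one-dimensional plane-wave trial functions $\exp(ikx)$, for which the expectation value of $\mathcal{H} - \mathbf{M}\cdot\mathbf{V}\hbar\omega$ in units of $\hbar\omega$ is $k^{2} - kV$; this toy model, the golden-section minimizer and all names are illustrative assumptions made here, not the author's calculation.

```python
import math


def j_free(k, v):
    """<H - M V> in units of hbar*omega for a free-particle plane wave exp(ikx)."""
    return k * k - k * v


def minimise(g, lo, hi, iters=100):
    """Golden-section search for the minimum of a unimodal function g on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if g(c) < g(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def effective_masses(v):
    """Return (k_best, m_a/m, m_b/m) from the analogues of eqns. (4.9 a) and (4.9 b)."""
    k_best = minimise(lambda k: j_free(k, v), -10.0, 10.0)
    energy = k_best**2                              # <H>; the V = 0 value is zero
    m_a = 4.0 * energy / v**2                       # from E(V) - E(0)
    m_b = -4.0 * j_free(k_best, v) / v**2           # from J_V - J_0, with J_0 = 0
    return k_best, m_a, m_b


k_best, m_a, m_b = effective_masses(0.3)
```

In this toy case the minimizing wave number is $k = V/2$, so the trial state carries total wave number $(m^{*}/2m)V$ with $m^{*} = m$.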

As with all other known variational principles the use of eqns. (4.9 a)
or (4.9 b) with the principle (4.8) does not give an upper or lower bound to
the effective mass. If the wave function $|\Psi(\mathbf{V})\rangle$ gets worse as a
solution of eqns. (4.8) as $V$ gets larger then one should have
$m^{*} > m_b^{*}$, but this is by no means universally applicable. There is no
reason to regard $m_a^{*}$ as representing in any way an upper or lower bound
to $m^{*}$.

When the method is applied to the polaron problem in Hartree
approximation it is found that $m_a^{*} = m_b^{*}$; this is satisfactory as it means
that the variational method here picks out a wave function with the correct
expectation value $(m^{*}/2m)\mathbf{V}$ for the total wave number $\mathbf{M}$.

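Finally it might be mentioned that the method gives the correct
effective mass if used in conjunction with a displaced oscillator wave
function (3.3). That this must be the case follows trivially since these
wave functions diagonalize the total wave number.

4.4. Elimination of the Field Variables


The variational principle (4.8) is easily applied to the Hartree trial
function (4.1). We define

$$ \Omega'(\mathbf{x}) = \Omega(\mathbf{x})\exp(-\tfrac{1}{2}i\mathbf{V}\cdot\mathbf{x}). \tag{4.10} $$

We have

$$ J_{\mathbf{V}}/\hbar\omega = \int \Omega'^{*}(\mathbf{x})\,(-\nabla^{2} - \tfrac{1}{4}V^{2})\,\Omega'(\mathbf{x})\, d^{3}x
 + \langle\Phi|\sum_{\mathbf{v}}\Big\{(1-\mathbf{v}\cdot\mathbf{V})\,b_{\mathbf{v}}^{\dagger}b_{\mathbf{v}}
 + \Big[i\Big(\frac{4\pi\alpha}{S}\Big)^{1/2}\frac{1}{v}\,b_{\mathbf{v}}^{\dagger}\int |\Omega'(\mathbf{x})|^{2}\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x + \text{c.c.}\Big]\Big\}|\Phi\rangle. \tag{4.11} $$

We first minimize the lattice contribution to equation (4.11), letting $\Omega'$
be arbitrary. Evidently we have a set of displaced oscillators with
displacements†

$$ d_{\mathbf{v}} = -i\left(\frac{4\pi\alpha}{S}\right)^{1/2}\frac{1}{v(1-\mathbf{v}\cdot\mathbf{V})}\int |\Omega'(\mathbf{x})|^{2}\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x. \tag{4.12} $$

The lattice contribution is minimized by putting the oscillators into their
lowest displaced states and their contribution to $J_{\mathbf{V}}$ is then†

$$ -\sum_{\mathbf{v}} (1-\mathbf{v}\cdot\mathbf{V})\, d_{\mathbf{v}}^{\dagger}d_{\mathbf{v}}\,\hbar\omega. \tag{4.13} $$

Thus, using eqns. (4.12) and (4.13) to eliminate the oscillators we obtain

$$ J_{\mathbf{V}}/\hbar\omega = \int \Omega'^{*}(\mathbf{x})\,(-\nabla^{2} - \tfrac{1}{4}V^{2})\,\Omega'(\mathbf{x})\, d^{3}x
 - \sum_{\mathbf{v}} \frac{4\pi\alpha}{S v^{2}(1-\mathbf{v}\cdot\mathbf{V})}\left|\int \Omega'^{*}(\mathbf{x})\,\Omega'(\mathbf{x})\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x\right|^{2}, \tag{4.14} $$

where

$$ \int \Omega'^{*}(\mathbf{x})\,\Omega'(\mathbf{x})\, d^{3}x = 1. $$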

4.5. Variation of the Electron Wave Function 
To find the effective mass we expand to order $V^{2}$ about the point
$\mathbf{V} = 0$. Doing this we obtain for the lattice contribution in eqn. (4.14)
the expression

$$ -\sum_{\mathbf{v}} \frac{4\pi\alpha}{S v^{2}}\,\big(1 + \mathbf{v}\cdot\mathbf{V} + (\mathbf{v}\cdot\mathbf{V})^{2} + O(V^{3})\big)
 \left|\int \Omega'^{*}(\mathbf{x})\,\Omega'(\mathbf{x})\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x\right|^{2}. \tag{4.15} $$

Now there is a 'selection rule' which makes the term linear in
$\mathbf{v}\cdot\mathbf{V}$ in the expression (4.15) identically zero for any $\Omega'$
whatever. For under rotations of $\mathbf{v}$ the expression $\mathbf{v}\cdot\mathbf{V}$
transforms like a vector while the expression

$$ \left|\int \Omega'^{*}\,\Omega'\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x\right|^{2}, $$

being real and positive, can only transform like a sum of tensors of even
rank. Thus the summation over the angles of $\mathbf{v}$ gives complete
cancellation for this term.

† See the first footnote to § 3.2.
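The cancellation of the linear term, and the factor 1/3 that the quadratic term contributes, can be illustrated by averaging over the directions of $\mathbf{v}$. The sketch below makes the simplifying assumption that the squared form factor depends only on $|\mathbf{v}|$ (a spherically symmetric charge cloud), so that only $\mu = \cos\theta$, the cosine of the angle between $\mathbf{v}$ and $\mathbf{V}$, needs to be averaged; the general argument in the text requires only that the form factor be even in $\mathbf{v}$. The function name is chosen here.

```python
def angular_average(g, n=2000):
    """Average of g(mu) over directions of v, where mu = cos(angle between v and V).
    For directions uniform on the sphere, mu is uniform on [-1, 1] (midpoint rule)."""
    h = 2.0 / n
    return sum(g(-1.0 + (j + 0.5) * h) for j in range(n)) * h / 2.0


linear_term = angular_average(lambda mu: mu)            # from v.V / (v V)
quadratic_term = angular_average(lambda mu: mu * mu)    # from (v.V)^2 / (v V)^2
```

The linear average vanishes and the quadratic average tends to 1/3, so that $(\mathbf{v}\cdot\mathbf{V})^{2}$ may be replaced by $v^{2}V^{2}/3$ under the sum.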
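By virtue of the above selection rule one has, to order $V^{2}$,

$$ J_{\mathbf{V}}/\hbar\omega = \int \Omega'^{*}(\mathbf{x})\,(-\nabla^{2} - \tfrac{1}{4}V^{2})\,\Omega'(\mathbf{x})\, d^{3}x
 - \sum_{\mathbf{v}} \frac{4\pi\alpha}{S v^{2}}\,\big(1 + (\mathbf{v}\cdot\mathbf{V})^{2} + O(V^{3})\big)
 \left|\int \Omega'^{*}(\mathbf{x})\,\Omega'(\mathbf{x})\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x\right|^{2}. \tag{4.16} $$

The expression (4.16) has been derived by Pekar (1954, eqn. (10.23))
treating the polarization field classically.

Evidently $J_{\mathbf{V}}$, as given in eqn. (4.16) having taken account of the
'selection rule', differs from $J_{0}$ only to order $V^{2}$. Therefore if
$\Omega'_{\mathbf{V}}(\mathbf{x})$ and $\Omega_{0}(\mathbf{x})$ are the best wave functions obtained by
varying $J_{\mathbf{V}}$ for velocities $\mathbf{V}$ and zero respectively, then
$\Omega'_{\mathbf{V}} - \Omega_{0}$ will be of order $V^{2}$, while on the other hand the
lattice state vector $|\Phi(\mathbf{V})\rangle$ differs from $|\Phi(0)\rangle$ already
in order $V$ since the displacements (4.12) differ in order $V$.

These considerations make it quite unnecessary actually to carry out
the minimization of $J_{\mathbf{V}}$, except for the case $\mathbf{V} = 0$. For since
$\Omega_{0}$ minimizes $J_{0}$, therefore $J_{\mathbf{V}}[\Omega_{0}]$ can differ from
$J_{\mathbf{V}}[\Omega'_{\mathbf{V}}]$ only in order $(\Omega'_{\mathbf{V}} - \Omega_{0})^{2}$, that is in
order $V^{4}$. This shows that to order $V^{2}$ the minimum value of
the expression (4.16) is

$$ \text{Min}\,(J_{\mathbf{V}}) = J_{0}[\Omega_{0}] - \tfrac{1}{4}V^{2}\hbar\omega
 - \sum_{\mathbf{v}} \frac{4\pi\alpha\hbar\omega}{S v^{2}}\,(\mathbf{v}\cdot\mathbf{V})^{2}
 \left|\int \Omega_{0}^{*}(\mathbf{x})\,\Omega_{0}(\mathbf{x})\exp(-i\mathbf{v}\cdot\mathbf{x})\, d^{3}x\right|^{2} + O(V^{4}). \tag{4.17} $$

Having found a good function $\Omega_{0}$ one can now calculate the effective mass
of the polaron by substituting eqn. (4.17) into eqn. (4.9 b), or alternatively
by using $\Omega_{0}$ with eqns. (4.12) and (4.9 a). Both of these procedures here
give the same value for the effective mass, namely

$$ \frac{m^{*}}{m} = 1 + \frac{4}{3}\sum_{\mathbf{v}} v^{2}\,|d_{\mathbf{v}}|^{2}. \tag{4.18} $$

Here $d_{\mathbf{v}}$ is the displacement (4.12) evaluated for $\mathbf{V} = 0$ with
$\Omega' = \Omega_{0}$, and the factor 4/3 arises because the average of
$(\mathbf{v}\cdot\mathbf{V})^{2}$ over the directions of $\mathbf{v}$ is $v^{2}V^{2}/3$.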


4.6. Numerical Results for the Landau—Pekar Static Approximation 

There are various simple forms which we may adopt for the trial
function $\Omega_{0}(\mathbf{x})$. Fröhlich (1954) has used the wave function appropriate
to the lowest energy state of an electron in a Coulomb potential. This
wave function has the form

$$ (\beta^{3}/8\pi)^{1/2}\exp(-\tfrac{1}{2}\beta\,|\mathbf{x}|). $$

Inserting the above wave function into (4.17), it is found that $J_{0}$ is least
when $\beta = 5\alpha/8$. The corresponding values for the energy and effective
mass are

$$ E(0)/\hbar\omega = -25\alpha^{2}/256 = -0.0977\,\alpha^{2}, $$

and

$$ \frac{m^{*}}{m} = 1 + \frac{1}{12}\left(\frac{5}{8}\right)^{3}\alpha^{4} = 1 + 0.0203\,\alpha^{4}. \tag{4.19 a} $$

The value of $E(0)$ is a little improved by using a harmonic oscillator or
Gaussian wave function (Pekar and Deigen 1948, Pekar 1954, Feynman
1955) of the form $(f/\sqrt{\pi})^{3/2}\exp(-\tfrac{1}{2}f^{2}x^{2})$.
The best value for this parameter $f$ is found to be
$\tfrac{1}{3}\alpha\sqrt{2/\pi}$, giving

$$ E(0)/\hbar\omega = -\alpha^{2}/3\pi = -0.106\,\alpha^{2}, $$

and

$$ \frac{m^{*}}{m} = 1 + \frac{16\,\alpha^{4}}{81\pi^{2}} = 1 + 0.0200\,\alpha^{4}. \tag{4.19 b} $$

Pekar (1954) has slightly improved on these values by using a wave
function of the form $(1 + a|\mathbf{x}| + bx^{2})\exp(-a|\mathbf{x}|)$, obtaining thereby
$E(0)/\hbar\omega = -0.1088\,\alpha^{2}$, and $m^{*}/m = 1 + 0.0208\,\alpha^{4}$. This may
perhaps furnish a check on the adequacy of eqns. (4.19 b) for example, though
one would expect to make the energy still lower by using a Gaussian factor
instead of the exponential in Pekar's improved wave function.
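The numbers quoted for the Coulomb-type and Gaussian trial functions can be reproduced by a short numerical minimization. The sketch below assumes the continuum replacement $\sum_{\mathbf{v}} \to S(2\pi)^{-3}\int d^{3}v$, under which $J_{0}/\hbar\omega = \langle -\nabla^{2}\rangle - (2\alpha/\pi)\int_{0}^{\infty}\rho(v)^{2}\,dv$ and $\sum_{\mathbf{v}} v^{2}|d_{\mathbf{v}}|^{2} = (2\alpha/\pi)\int_{0}^{\infty} v^{2}\rho(v)^{2}\,dv$, where $\rho(v)$ is the Fourier transform of $|\Omega_{0}|^{2}$. The closed forms used for the kinetic energy and for $\rho(v)$ were worked out here from the two trial functions; the function names, the quadrature and the minimizer are illustrative choices, not the author's procedure.

```python
import math


def radial_integral(f, n=2000):
    """Midpoint estimate of the integral of f(v) over 0 < v < infinity (v = tan(theta))."""
    h = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        v = math.tan((j + 0.5) * h)
        total += f(v) * (1.0 + v * v) * h
    return total


# For each trial family: <-grad^2> and the form factor rho(v), as functions of the
# scale parameter s (beta for the Coulomb-type function, f for the Gaussian).
TRIALS = {
    "coulomb": (lambda s: s * s / 4.0,
                lambda s, v: (s * s / (s * s + v * v)) ** 2),
    "gaussian": (lambda s: 1.5 * s * s,
                 lambda s, v: math.exp(-v * v / (4.0 * s * s))),
}


def j0(kind, s, alpha):
    """J_0 / (hbar*omega) for the chosen trial function and scale parameter."""
    kinetic, rho = TRIALS[kind]
    lattice = (2.0 * alpha / math.pi) * radial_integral(lambda v: rho(s, v) ** 2)
    return kinetic(s) - lattice


def mass_ratio(kind, s, alpha):
    """m*/m from eqn. (4.18): 1 + (4/3) sum_v v^2 |d_v|^2."""
    _, rho = TRIALS[kind]
    quanta = (2.0 * alpha / math.pi) * radial_integral(lambda v: (v * rho(s, v)) ** 2)
    return 1.0 + (4.0 / 3.0) * quanta


def minimise(g, lo, hi, iters=60):
    """Golden-section search for the minimum of a unimodal function g on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if g(c) < g(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


def landau_pekar(kind, alpha):
    """Return (best scale parameter, E(0)/hbar*omega, m*/m) for the trial family."""
    s = minimise(lambda t: j0(kind, t, alpha), 0.01 * alpha, 5.0 * alpha)
    return s, j0(kind, s, alpha), mass_ratio(kind, s, alpha)
```

For example, `landau_pekar("gaussian", 1.0)` returns a scale parameter close to $\tfrac{1}{3}\sqrt{2/\pi}$ and an energy close to $-1/3\pi$, and `landau_pekar("coulomb", 1.0)` a scale parameter close to 5/8.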


4.7. The Validity of the Static Approximation 


In establishing the range of validity of the static approximation we
must bear in mind the following points:

(i) The static approximation gives, for $\alpha$ larger than 10, a lower value to
the rest energy than does the dynamic method.

(ii) The critical length at which the classical polarization field is cut
off is of order $\sqrt{\hbar/2m\omega}$ in the dynamic method, but of order

$$ \alpha^{-1}\sqrt{\hbar/2m\omega} = \hbar^{2}/\{me^{2}(1/\epsilon_\infty - 1/\epsilon)\} $$

in the static method.

(iii) We saw that for large $\alpha$ the dynamic method fails because of the
recoil correlation energy. The Hartree approximation, on the other
hand, keeps the recoil down while at the same time having many
uncorrelated quanta in the polarization field. This is achieved without
complication only at the price of giving up the requirement that the total
wave number should be diagonal, the wave number of the electron
being no longer correlated to the wave number of the quanta present.

(iv) At the cross-over point $\alpha \approx 10$ the effective mass according to the
Landau-Pekar method exceeds the estimate of Lee et al. (1953) by a
factor of about 45. This enormous discrepancy clearly indicates the
existence of a region near $\alpha = 10$ in which neither the high frequency nor
the low frequency approximations are adequate. It seems plausible that
at the point $\alpha = 10$ the greater part of the discrepancy arises from the
inadequacy of the Hartree approximation. This viewpoint is supported
to some extent by Feynman's calculation of the effective mass, to be
discussed in § 6. But for sufficiently large $\alpha$ the considerations of parts
1 and 2 of this section point fairly conclusively to the approximate
validity of the Hartree approximation and therefore to the large values
given in part 6 for the effective mass.


P.M. SUPPL., OCTOBER 1956 2H


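Clearly one defect of the Hartree approximation is that it necessitates
a localization in space. The corresponding localization energy will be
$(1/2m^{*})\langle\hbar^{2}M^{2}\rangle$, using ordinary units; or
$(m/m^{*})\hbar\omega\langle M^{2}\rangle$ using dimensionless units. (Here
$\langle M^{2}\rangle$ is the expectation value of the squared total
wave number (eqn. (2.27)) taken in the Hartree state (4.1).) There
are no correlations between different quanta or between quanta and
electron, and therefore $\langle M^{2}\rangle$ reduces to

$$ \langle -\partial^{2}/\partial\mathbf{x}^{2}\rangle + \sum_{\mathbf{v}} v^{2}\,|d_{\mathbf{v}}|^{2}, \tag{4.20} $$

where $d_{\mathbf{v}}$ is given by eqn. (4.12).
For a Gaussian trial function (eqn. (4.19 b)) one finds

$$ \langle -\partial^{2}/\partial\mathbf{x}^{2}\rangle = \alpha^{2}/3\pi, \tag{4.21} $$

$$ \sum_{\mathbf{v}} v^{2}\,|d_{\mathbf{v}}|^{2} = 4\alpha^{4}/27\pi^{2}. \tag{4.22} $$

The effective mass, eqn. (4.19 b), is approximately $(16\alpha^{4}/81\pi^{2})\,m$, so that
the localization energy is

$$ (m/m^{*})\,\hbar\omega\,\langle M^{2}\rangle \simeq \tfrac{3}{4}\hbar\omega. \tag{4.23} $$

This result can be shown to be independent of the form of the trial
function $\Omega_{0}(\mathbf{x})$.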
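The approach of the localization energy to $\tfrac{3}{4}\hbar\omega$ at large coupling can be seen by evaluating $(m/m^{*})\langle M^{2}\rangle$ with the Gaussian expressions $\langle -\partial^{2}/\partial\mathbf{x}^{2}\rangle = \alpha^{2}/3\pi$, $\sum_{\mathbf{v}} v^{2}|d_{\mathbf{v}}|^{2} = 4\alpha^{4}/27\pi^{2}$ and $m^{*}/m = 1 + 16\alpha^{4}/81\pi^{2}$. The following is a minimal sketch; the function name is chosen here, and keeping the full mass ratio rather than its large-$\alpha$ form is a choice made for illustration.

```python
import math


def localization_energy(alpha):
    """(m/m*) <M^2> in units of hbar*omega for the Gaussian trial function."""
    kinetic = alpha**2 / (3.0 * math.pi)                        # electron part of <M^2>
    quanta = 4.0 * alpha**4 / (27.0 * math.pi**2)               # lattice part of <M^2>
    mass_ratio = 1.0 + 16.0 * alpha**4 / (81.0 * math.pi**2)    # m*/m
    return (kinetic + quanta) / mass_ratio


# The electron term is of relative order alpha^-2, so the value tends to 3/4.
values = [localization_energy(a) for a in (10.0, 30.0, 100.0)]
```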

Pekar (1949, 1954) has shown that for large $\alpha$ the energy given by the
Landau-Pekar approximation is too high by $\tfrac{3}{2}\hbar\omega$ or more. We shall
see later (§ 5) that this energy may be considered as the sum of two
distinct contributions; one of magnitude $\tfrac{3}{4}\hbar\omega$ being the localization
energy derived above, and the other arising from interaction with the
zero point fluctuations.

We have not calculated the corrections of order $V^{4}$. These corrections
will fall into two categories according as to whether or not they depend on
the change in shape which the motion will induce in the charge cloud.
The latter corrections are unimportant for polaron kinetic energies
less than $\alpha\hbar\omega$, as follows from the considerations of Pelzer (1950) and
Pekar (1949, 1954)†. The shape dependent corrections in order $V^{4}$
may, however, be appreciable; this has not been investigated. But
for kinetic energies less than $\hbar\omega$ one would not expect any large correction,
so that the estimate (4.23) for the localization energy should be good.

† The criterion given by these authors is that the frequency of the forced
vibrations of the polarization shall be much less than the natural frequency
$\omega$. Thus one has $V \ll \omega\lambda_{min}$, where $\lambda_{min}$ is the minimum
wavelength involved, i.e. the radius of the charge cloud.

§ 5. THE LOW FREQUENCY ADIABATIC APPROXIMATION
5.1. The Fluctuation Energy


The Hartree or static approximation implies the neglect of the zero
point fluctuations of the polarization field (§ 4.2). Pekar (1949, 1954
§§ 15, 16) has given an approximate treatment of the interaction of the
electron with these fluctuations. The idea is that the electron follows
the polarization fluctuations adiabatically, so that it never occupies an
excited state in the potential due to the polarization.† The effective
Hamiltonian is then the sum of the polarization Hamiltonian and the
energy eigenvalue of the lowest electronic state in the instantaneous
polarization field. The electron degrees of freedom are thus integrated
out of the problem. In principle this would be exact (up to and including
terms of order $\alpha^{0}$ in the energy) for sufficiently small lattice frequency
(large $\alpha$). However, it is difficult to deal exactly with the motion of an
electron in an arbitrary potential, and therefore some variational method
is indicated. Pekar's choice is the simplest one could make, yet brings
out the essential features of the problem. It is based on the facts that the
wave function must be a superposition of states in each of which the
polarization never departs much from that appertaining to a Hartree state
localized at some point $\mathbf{r}$ in the lattice, and that correspondingly for each
of these states the electron's wave function will be approximately of the
form $\Omega_{0}(\mathbf{x}-\mathbf{r})$ where $\Omega_{0}$ is the electronic wave function used in the
Hartree solution.‡ But since all the values of $\mathbf{r}$ are equally accessible
Pekar regards $\mathbf{r}$ as a variational parameter whose value is to depend on the
co-ordinates $q$ specifying the state of the polarization. Explicitly, one
must work in a representation in which the polarization variables (i.e.
the $q$) are diagonal, and to each set of eigenvalues of the $q$ one must
attribute a corresponding $\mathbf{r}(q)$ such that $\Omega_{0}(\mathbf{x}-\mathbf{r}(q))$ is a
variational function minimizing $\mathcal{H}_{el} + \mathcal{H}_{int}(q)$.

Working on these lines Pekar has derived an effective Hamiltonian
for the problem, including the scattering of a polarization quantum by
the polaron, etc., and has given also in this approximation a variational
derivation of the energy of a polaron§. He obtains the Hartree value
for the effective mass and finds that the energy lies by at least (3/2)ħω
below the Hartree value.

Although the ideas lying behind Pekar’s theory are so simple the calcu- 
lations are very complicated because it is impossible actually to find any 
analytic representation of the expression r (q). 

However, by applying the adiabatic approximation to a localized 
polaron, and then subtracting the localization energy already calculated 
in § 4.7, we can obtain Pekar’s energy values without any complicated 
mathematics. We shall carry out the calculation in part 3 of this section. 
First, however, we must define the q representation.


† This is essentially the Born-Oppenheimer approximation, which has been
discussed in some detail by Born and Huang (1954, Chapter IV).

‡ Thus changes in shape are not considered in Pekar’s approximation.

§ Pekar’s derivation of the effective Hamiltonian cannot be recommended
as it involves a peculiarly unsymmetrical coordinate transformation. The
interested reader should instead consult the work of Bogoljubow and Tjablikow:
see §§ 5.5, 5.6 of the present work.


5.2. The q Representation
To express the adiabatic approximation in a clear way we represent the
lattice polarization by a set of canonical harmonic oscillator variables
$q_{\alpha v}$ and $p_{\alpha v}$ referring to an expansion in cosine (α=1)
and sine (α=2) waves. (Here v runs only over a half-space: i.e. a space in
which, if a vector v occurs, then −v does not occur.) The essential point is
that the $q_{\alpha v}$ and $p_{\alpha v}$ are chosen so that the interaction
involves only the $q_{\alpha v}$. We define


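$$q_{1v} = \tfrac{1}{2} i\,(b_v^{+} + b_{-v}^{+} - b_v - b_{-v}), \qquad p_{1v} = -\tfrac{1}{2}\,(b_v^{+} + b_{-v}^{+} + b_v + b_{-v}),$$
$$q_{2v} = \tfrac{1}{2}\,(b_v^{+} - b_{-v}^{+} + b_v - b_{-v}), \qquad p_{2v} = \tfrac{1}{2} i\,(b_v^{+} - b_{-v}^{+} - b_v + b_{-v}). \qquad (5.1)$$

Using the relations

$$b_v = \tfrac{1}{2}\,(q_{2v} + i p_{2v} + i q_{1v} - p_{1v}), \qquad b_{-v} = \tfrac{1}{2}\,(-q_{2v} - i p_{2v} + i q_{1v} - p_{1v}) \qquad (5.2)$$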

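then enables one to rewrite the Hamiltonian (2.1) in the perhaps more
familiar form

$$H = \hbar\omega\Big\{ -\nabla^2 + \sum_{\alpha,v}{}' \tfrac{1}{2}\,(q_{\alpha v}^2 + p_{\alpha v}^2) + 2\Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \sum_v{}' \frac{1}{v}\,\big(q_{1v}\cos(v\cdot x) + q_{2v}\sin(v\cdot x)\big) \Big\}, \qquad (5.3)$$

where Σ′ denotes a sum over half space.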
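For the Hartree state at a point y we have oscillator displacements
$d_{\alpha v}(y)$ as follows:

$$d_{1v}(y) = -2\Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \frac{1}{v} \int |\Omega_0(x-y)|^2 \cos(v\cdot x)\, d^3x = d_{1v}(0)\cos(v\cdot y), \qquad (5.4a)$$

$$d_{2v}(y) = -2\Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \frac{1}{v} \int |\Omega_0(x-y)|^2 \sin(v\cdot x)\, d^3x = d_{1v}(0)\sin(v\cdot y). \qquad (5.4b)$$

For all values of y we have

$$d_{1v} = \tfrac{1}{2} i\,(d_v^{+} + d_{-v}^{+} - d_v - d_{-v}), \qquad (5.5a)$$

$$d_{2v} = \tfrac{1}{2}\,(d_v^{+} - d_{-v}^{+} + d_v - d_{-v}). \qquad (5.5b)$$

From eqns. (5.5 a), (5.5 b) and (4.18) we have

$$\sum_{\alpha,v}{}' d_{\alpha v}^2(y)\, v_i v_j = \sum_v{}' d_{1v}^2(0)\, v_i v_j = (m^*/2m)\,\delta_{ij}. \qquad (5.6)$$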
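As a numerical illustration of the sum rule (5.6), the sketch below evaluates the half-space sum as an integral for a Gaussian charge cloud and compares the result with the Landau-Pekar mass 16α⁴/(81π²) quoted later in the text. It rests on assumptions not spelled out in this section: that (4.19 b) is the Gaussian Ω₀ ∝ exp(−β²x²/2) with β = (α/3)√(2/π), so that d_1v(0) = −2(4πα/S)^{1/2} v⁻¹ exp(−v²/4β²), and that units are such that 2m = 1. The function name `mass_ratio` is hypothetical.

```python
import math

def mass_ratio(alpha, n_steps=20000):
    """m*/m from the sum rule (5.6) for an assumed Gaussian charge cloud."""
    beta = (alpha / 3.0) * math.sqrt(2.0 / math.pi)
    v_max = 12.0 * beta                      # integrand is negligible beyond this

    def f(v):
        return v * v * math.exp(-v * v / (2.0 * beta * beta))

    # Simpson's rule for the radial integral of v^2 exp(-v^2 / (2 beta^2)).
    h = v_max / n_steps
    total = f(0.0) + f(v_max)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * f(k * h)
    integral = total * h / 3.0

    # Half-space sum -> (1/2) S/(2 pi)^3 * integral over all v; angular average 1/3.
    half_ratio = (4.0 * alpha / (3.0 * math.pi)) * integral   # this is m*/2m
    return 2.0 * half_ratio

alpha = 10.0
numeric = mass_ratio(alpha)
landau_pekar = 16.0 * alpha**4 / (81.0 * math.pi**2)
print(numeric, landau_pekar)
```

The agreement of the two numbers is a check on the prefactors and signs used in (5.3) and (5.4) as written here, rather than an independent derivation of the effective mass.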


5.3. Derivation of Pekar’s Results 
We now consider a polaron localized at the origin, but with the electron
following the lattice fluctuations adiabatically (in Pekar’s approximation
that the only variational parameter is r(q)).

The wave function is thus of the following form, apart from a normalization
factor,

$$\Omega_0(x - r(q))\, \exp\Big\{ -\tfrac{1}{2} \sum_{\alpha,v}{}' \big(q_{\alpha v} - d_{\alpha v}(0)\big)^2 \Big\}. \qquad (5.7)$$
Since the polarization is localized (apart from small fluctuations) to form
a deep potential well, the coordinate r(q) will be small, and we
can use Taylor series expansions in r(q). This makes the calculations
easy. (Pekar does not assume a localized polaron and therefore such
expansions cannot be applied to his calculations.) We define

$$q'_{\alpha v} = q_{\alpha v} - d_{\alpha v}(0). \qquad (5.8)$$

The value of the operator ∫Ω₀*(x−r)(H_el. + H_int.)Ω₀(x−r) d³x differs
from its Hartree approximation expectation value by the following
expression ΔE(q, r) (using eqn. (5.3)),

$$\Delta E(q,r) = 2\hbar\omega \Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \sum_v{}' \frac{1}{v} \int |\Omega_0(x-r)|^2 \big\{ \cos(v\cdot x)\,(q_{1v} - d_{1v}(r)) + \sin(v\cdot x)\,(q_{2v} - d_{2v}(r)) \big\}\, d^3x. \qquad (5.9)$$

Using eqns. (5.4) we may rewrite eqn. (5.9) in the form

$$\Delta E(q,r) = \hbar\omega \sum_v{}' d_{1v}(0)\,\{ d_{1v}(0) - q_{1v}\cos v\cdot r - q_{2v}\sin v\cdot r \}. \qquad (5.10)$$


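Then eliminating the $q_{\alpha v}$ in favour of the $q'_{\alpha v}$ by eqn.
(5.8) we have, approximately,

$$\Delta E(q,r) = \hbar\omega \sum_v{}' d_{1v}(0)\,\{ \tfrac{1}{2} d_{1v}(0)\,(v\cdot r)^2 - q'_{1v} - (v\cdot r)\, q'_{2v} \}. \qquad (5.11)$$

Using eqn. (5.6) and minimizing ΔE with respect to r we find that we must
take

$$r(q) = (2m/m^*) \sum_v{}' d_{1v}(0)\, q'_{2v}\, v + O(q^2), \qquad (5.12)$$

which gives

$$\Delta E(q, r(q)) = -\hbar\omega \sum_v{}' d_{1v}(0)\, q'_{1v} - \hbar\omega\,(m/m^*) \sum_{v,w}{}' d_{1v}(0)\, d_{1w}(0)\,(v\cdot w)\, q'_{2v}\, q'_{2w} + O(q^3). \qquad (5.13)$$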

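The energy of the localized polaron will be lowered below the Hartree
value by the expectation value of the correction (5.13) in the polarization
state Φ|0⟩ appropriate to the Hartree solution †‡.

To evaluate the expectation value we use

$$\langle q'_{\alpha v} \rangle = 0, \qquad \langle q'_{\alpha v}\, q'_{\beta w} \rangle = \tfrac{1}{2}\,\delta_{\alpha\beta}\,\delta_{vw}; \qquad (5.14)$$

hence, using eqns. (5.13) and (5.6),

$$\langle \Delta E(q, r(q)) \rangle = -(m/2m^*) \sum_v{}' d_{1v}^2(0)\, v^2\, \hbar\omega = -\tfrac{3}{4}\hbar\omega. \qquad (5.15)$$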


Adding to this fluctuation energy the localization energy −¾ħω obtained
in § 4.7 we obtain for the total lowering of the energy in Pekar’s
approximation the amount (3/2)ħω. This agrees with Pekar’s result

$$E_{\rm Pekar} = E_{\rm Hartree\ approx.} - (3/2)\hbar\omega. \qquad (5.16)$$

† The energy values corresponding to ΔE obtained by this method are higher
than the true energy, except for neglected terms of order 1/α; for the method
can be put onto a variational footing without difficulty.

‡ It would be incorrect to add the correction (5.13) to the oscillator
Hamiltonian and then try to diagonalize the sum exactly; this would lead to a
de-localization and to a breakdown of the expansions used. This matter will
be further discussed in § 8.


5.4. The Work of Bogoljubow and Tjablikow 


Pekar’s treatment of the adiabatic approximation, although bringing out
many interesting features, is essentially inexact, the electron wave
function being allowed only to move about through the lattice, and not being
allowed to change its shape. This deficiency in the method has been
removed by Bogoljubow and Tjablikow. Fortunately there is now a
German version of the work of these authors (Tjablikow 1954) as well as
the original papers in Russian (Bogoljubow 1950, Bogoljubow and
Tjablikow 1949, Tjablikow 1951, 1953 a).

The interaction with the fluctuation part of the field is taken into
account exactly (with neglect only of energies of order 1/α) by allowing
the electron to go into a complete set of excited states defined with 
respect to the effective polarization potential obtained in Hartree approxi- 
mation. In addition the localization energy is removed by allowing a 
freedom of location to the origin with respect to which the Hartree 
approximation is defined. This introduces three superfluous co-ordinates 
analogous to Pekar’s variables r (q) and has to be compensated by the 
imposition of three subsidiary conditions without which the representa- 
tion would not be unique. The subsidiary conditions are chosen to make 
the origin of the Hartree approximation energetically favourable for the 
electron. 

The effective Hamiltonian thus obtained for the system ‘polaron + free
quanta’ differs from that obtained by Pekar (1949, 1954) by the addition
of extra terms†. The effective mass is unaltered, however.

We shall not enter here into the details of the work of Bogoljubow and 
Tjablikow, but shall merely generalize the fluctuation energy calculation 
for a localized polaron given in § 5.3, so as to give an accurate account 
of the motion of the electron in the fluctuating polarization field. This 
will furnish a check on the more complicated calculations of these two 
authors. 


† It also differs in that complete symmetry is maintained between all wave
vectors v, albeit at the expense of introducing subsidiary conditions (cf. the
third footnote to § 5.1).


5.5. Calculation of the Fluctuation Energy
The electron Hamiltonian to be integrated out from the problem is
written as the sum of two parts, the first being the unperturbed part
(which does not involve the fluctuations), namely

$$-\nabla^2 + 2\Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \sum_v{}' \frac{1}{v}\,\big(d_{1v}(0)\cos v\cdot x + d_{2v}(0)\sin v\cdot x\big), \qquad (5.17)$$

and the second part being a small perturbation on the motion of the
electron,

$$2\Big(\frac{4\pi\alpha}{S}\Big)^{1/2} \sum_v{}' \frac{1}{v}\,\big(q'_{1v}\cos v\cdot x + q'_{2v}\sin v\cdot x\big). \qquad (5.18)$$


Then, taking a complete set of eigenfunctions $\Omega_n(x)$ for the Hamiltonian
(5.17), with eigenvalues $\epsilon_n$, we eliminate the motion of the electron by
standard second order perturbation theory and replace the approximate
expression (5.13) by the following expression, which is exact to order q′².

$$\Delta E(q') = -\hbar\omega \sum_v{}' d_{1v}(0)\, q'_{1v} - \hbar\omega\, \frac{16\pi\alpha}{S} \sum_v{}' \sum_w{}' \sum_{n\neq 0} \frac{1}{vw\,(\epsilon_n - \epsilon_0)}$$
$$\qquad \times \big(q'_{1v}\langle 0|\cos v\cdot x|n\rangle + q'_{2v}\langle 0|\sin v\cdot x|n\rangle\big) \big(q'_{1w}\langle n|\cos w\cdot x|0\rangle + q'_{2w}\langle n|\sin w\cdot x|0\rangle\big). \qquad (5.19)$$

The expectation value of the correction (5.19) in the polarization state
appropriate to the Hartree approximation is then


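$$\langle \Delta E(q') \rangle = -\frac{8\pi\alpha}{S}\,\hbar\omega \sum_v{}' \sum_{n\neq 0} \frac{|\langle n|\cos v\cdot x|0\rangle|^2 + |\langle n|\sin v\cdot x|0\rangle|^2}{v^2(\epsilon_n - \epsilon_0)} + \hbar\omega\, O(1/\alpha)$$
$$\qquad = -\frac{4\pi\alpha}{S}\,\hbar\omega \sum_v \sum_{n\neq 0} \frac{|\langle n|\exp(iv\cdot x)|0\rangle|^2}{v^2(\epsilon_n - \epsilon_0)} + \hbar\omega\, O(1/\alpha). \qquad (5.20)$$

This expression agrees with that obtained by Bogoljubow and Tjablikow†,
and is variationally correct for large α. The rest energy of the polaron
must therefore lie below or coincide with‡ the following upper bound
(assuming the validity of our simple derivation of the localization energy),

$$E = E_{\rm Hartree} - \frac{4\pi\alpha}{S}\,\hbar\omega \sum_v \sum_{n\neq 0} \frac{|\langle n|\exp(iv\cdot x)|0\rangle|^2}{v^2(\epsilon_n - \epsilon_0)} - \tfrac{3}{4}\hbar\omega + \hbar\omega\, O(1/\alpha). \qquad (5.21)$$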


Now Bogoljubow and Tjablikow obtain −(3/2)ħω instead of −(3/4)ħω for
the last term of eqn. (5.21). Thus while the calculations presented here
can explain Pekar’s results, they do not explain those of Bogoljubow and
Tjablikow, whose work should therefore be subjected to further
examination§.
† The expression (5.20) can also be derived by considering the interaction
(5.18) simply as a perturbation of the total Hamiltonian of the system, as has
been shown by Höhler (1955 b), i.e., without invoking the adiabatic
approximation.

‡ In § 8 arguments will be presented making it extremely plausible that the
rest energy coincides exactly with (5.21).

§ The error almost certainly comes in the diagonalization of the expression
(38) in Tjablikow (1954), and is connected with the subsidiary conditions.


5.6. Estimation of the Fluctuation Energy

In evaluating the fluctuation energy (5.20), we come up against the
difficulty that the exact solution to the Hartree problem is not known.
But we expect that the use of an approximate solution for the latter
should give a reasonable estimate for the fluctuation energy.

More serious, however, is the fact that the eigenfunctions of the
Hamiltonian (5.17) are in any case unknown. The potential energy term
in the Hamiltonian (5.17) corresponds to a Coulomb field cut off near the
origin at a distance x of order α⁻¹ (cf. eqn. (5.32))†.

Feynman (1955) has given some powerful variational calculations for
large α based on a harmonic potential. Clearly a harmonic potential
bears little resemblance to the true state of affairs. Nevertheless it
is interesting and easy to evaluate the expression (5.20) using a harmonic
potential in the Hamiltonian (5.17); this casts considerable light on the
meaning of Feynman’s method, and furthermore turns out to be a
justifiable approximation to the problem. (The reason for the latter will
become quite apparent subsequently.)

Thus we shall replace the potential energy term in the Hamiltonian
(5.17) by the following harmonic potential appropriate to the harmonic
oscillator wave function (4.19 b) used in Hartree approximation

$$V_{\rm harm} = \beta^4 x^2 - (15/2)\,\beta^2 \quad {\rm with} \quad \beta = \tfrac{1}{3}\alpha\sqrt{2/\pi}. \qquad (5.22)$$

This harmonic potential has been so chosen that the exact lowest eigenvalue
and eigenstate of the modified Hamiltonian (5.17) agree with those
obtained variationally for the original Hamiltonian (5.17), i.e. with
(4.19b). The excited states will be quite different. Nevertheless the
calculation of the fluctuation energy (eqn. (5.20)) using the harmonic
potential (5.22) turns out to give precisely Feynman’s answer! The
calculation runs as follows.

We define new variables y = βx, w = (1/β)v, in terms of which the modified
form of the Hamiltonian (5.17) becomes

$$2\beta^2\big( -\tfrac{1}{2}\nabla_y^2 + \tfrac{1}{2} y^2 - \tfrac{15}{4} \big), \qquad (5.23)$$

i.e. the standard oscillator functions may now be used to describe the
states $\Omega_n$ available to the electron.

The matrix element ⟨n|exp(iv·x)|0⟩ for a one-dimensional
oscillator described by the Hamiltonian (5.23) is

$$(iw)^n \exp(-\tfrac{1}{4} w^2)\big/\sqrt{2^n\, n!}. \qquad (5.24)$$
Hence the fluctuation energy (5.20) becomes

$$\langle \Delta E \rangle = -\frac{4\pi\alpha\hbar\omega}{S} \sum_{w} \sum_{n_1,n_2,n_3} \frac{\exp(-\tfrac{1}{2}w^2)\, w_1^{2n_1} w_2^{2n_2} w_3^{2n_3}}{2\beta^4 w^2\,(n_1+n_2+n_3)\; 2^{\,n_1+n_2+n_3}\; n_1!\, n_2!\, n_3!} \qquad (n_1+n_2+n_3 \neq 0). \qquad (5.25)$$

Writing w = u√2, substituting β = ⅓α√(2/π), changing the sum into an
integral, and using the identity

$$\frac{(x^2+y^2+z^2)^N}{N!} = \sum_{n_1+n_2+n_3=N} \frac{x^{2n_1}\, y^{2n_2}\, z^{2n_3}}{n_1!\, n_2!\, n_3!}, \qquad (5.26)$$

we obtain

$$\langle \Delta E \rangle/\hbar\omega = -\frac{3}{\sqrt{\pi}} \sum_{N=1}^{\infty} \int_0^{\infty} du\, \exp(-u^2)\, \frac{u^{2N}}{N\cdot N!}. \qquad (5.27)$$

† Clearly a reasonable procedure would be to tackle the problem of the
fluctuation energy variationally, but we shall not do this here.

(It is noteworthy that the term N=1 in eqn. (5.27) gives −¾, i.e. the result
obtained using Pekar’s adiabatic approximation (§ 5.3). This is not
surprising because the first excited state of a harmonic oscillator is the
space derivative of the ground state.)

To evaluate the whole of the sum (5.27) we may replace the factor
1/N by

$$\int_0^1 x^{N-1}\, dx,$$

so obtaining a double integral in which the summation over N can be
explicitly carried out to give an exponential, thus

$$\langle \Delta E \rangle/\hbar\omega = -\frac{3}{\sqrt{\pi}} \int_0^{\infty} du \int_0^1 \frac{dx}{x}\, \exp(-u^2)\,\big(\exp(xu^2) - 1\big) \qquad (5.28)$$

$$\qquad = -\frac{3}{2} \int_0^1 \frac{dx}{x} \Big( \frac{1}{\sqrt{1-x}} - 1 \Big). \qquad (5.29)$$

Putting 1−x = z² we then obtain

$$\langle \Delta E \rangle/\hbar\omega = -3 \int_0^1 \frac{dz}{1+z} = -3\log 2. \qquad (5.30)$$
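The series behind this result is easy to check numerically. The sketch below is an illustration only: it assumes the u-integral in (5.27) is evaluated as Γ(N+½)/2, so that each term becomes −(3/2) r_N/N with r_N = Γ(N+½)/(√π N!), and it generates r_N by the recurrence r_N = r_{N−1}(2N−1)/(2N). The function name `fluctuation_energy` is hypothetical.

```python
import math

def fluctuation_energy(n_terms=200000):
    """Partial sum of the series (5.27) in units of hbar*omega, and its N = 1 term."""
    r = 1.0                  # r_0 = Gamma(1/2) / sqrt(pi) = 1
    total = 0.0
    first = 0.0
    for n in range(1, n_terms + 1):
        r *= (2 * n - 1) / (2 * n)      # r_n = Gamma(n + 1/2) / (sqrt(pi) n!)
        term = -1.5 * r / n
        if n == 1:
            first = term
        total += term
    return total, first

total, first = fluctuation_energy()
exact = -3.0 * math.log(2.0)
print(first, total, exact)
```

The terms fall off only like N^(-3/2), so the partial sum approaches −3 log 2 slowly; the closed form (5.30) is the practical way to obtain the value, and the sum serves merely as a cross-check.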
Thus, including the localization energy (4.23) we have

$$E = (-\alpha^2/3\pi - 3\log 2 - \tfrac{3}{4})\,\hbar\omega. \qquad (5.31)$$


This is the same as the result obtained variationally by Feynman (1955). 
Thus Feynman’s calculations must in some way take account of the 
fluctuation energy and of the localization energy. 
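To give a feeling for the relative size of the three contributions in (5.31), the following sketch tabulates them for a few values of the coupling constant. The values of α are arbitrary illustrations, and the function name `energy_terms` is hypothetical; nothing here goes beyond evaluating the formula.

```python
import math

def energy_terms(alpha):
    """Hartree, fluctuation and localization terms of (5.31), units of hbar*omega."""
    hartree = -alpha**2 / (3.0 * math.pi)
    fluctuation = -3.0 * math.log(2.0)
    localization = -0.75
    return hartree, fluctuation, localization

for alpha in (6.0, 10.0, 20.0):
    hartree, fluct, loc = energy_terms(alpha)
    total = hartree + fluct + loc
    print(f"alpha={alpha:5.1f}  Hartree={hartree:8.3f}  "
          f"fluct={fluct:6.3f}  loc={loc:6.3f}  total={total:8.3f}")
```

The constant terms are independent of α, so their relative weight falls off like 1/α² as the coupling grows.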

Although Feynman obtained eqn. (5.31) variationally this does not in
itself necessarily mean that −3 log 2 is a good value for the fluctuation
energy. For the first term in Feynman’s asymptotic expansion (5.31)
is known to be approximate (compare § 4.6), and this could conceivably
take away all significance from the second term. The crux of the matter
is whether or not one would obtain anything comparable to −3 log 2 if
calculations were based on the exact potential instead of on a harmonic
potential.

Clearly the calculations based on a harmonic potential will be reasonably 
satisfactory if the exact potential and harmonic potential agree wherever 
the electron’s wave function is large. 

An evaluation of the potential V(x) appearing in the Hamiltonian
(5.17), using the $d_{\alpha v}$ appropriate to the oscillator wave function
(4.19 b), can be shown to give

$$V(x) = -\frac{2\alpha}{x}\,\Phi(\beta x) = -3\sqrt{2\pi}\,\beta^2\, \frac{\Phi(\beta x)}{\beta x}, \qquad (5.32)$$

where Φ is the error integral, i.e.

$$\Phi(s) = \frac{2}{\sqrt{\pi}} \int_0^s \exp(-t^2)\, dt. \qquad (5.33)$$

[Figure] Comparison of the Hartree and harmonic potentials for the Gaussian
trial function.
(a) Harmonic potential (5.22).
(b) Hartree potential (5.32).
(c) Coulomb potential.
(d) Energy of electron (E/β² = −4.5).

The potential (5.32) is compared with the harmonic potential (5.22) in
the figure. It will be seen that these potentials agree satisfactorily 
except in regions well above the actual energy level of the electron. We 
conclude that the use of a harmonic potential gives a reasonable estimate 
of the fluctuation energy (5.20). 
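The comparison made in the figure can be reproduced approximately with a few lines of code. This is a sketch under stated assumptions: the Hartree potential is taken to be that of a Gaussian charge cloud, −3√(2π) Φ(s)/s in units of β² with s = βx, and the harmonic potential is s² − 15/2 in the same units. The function names are hypothetical.

```python
import math

def harmonic(s):
    """Harmonic potential (5.22) in units of beta^2, with s = beta * x."""
    return s * s - 7.5

def hartree(s):
    """Assumed Gaussian-cloud Hartree potential in units of beta^2."""
    if s == 0.0:
        return -6.0 * math.sqrt(2.0)            # limit of erf(s)/s as s -> 0
    return -3.0 * math.sqrt(2.0 * math.pi) * math.erf(s) / s

E_LEVEL = -4.5                                   # electron energy in units of beta^2

for s in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
    print(f"s={s:3.1f}  harmonic={harmonic(s):7.3f}  hartree={hartree(s):7.3f}")
```

Under these assumptions the two curves stay within about one unit of β² of each other out to the classical turning point of the harmonic well (s = √3, where the harmonic potential equals E_LEVEL), and separate only at larger s, where the harmonic potential rises without limit while the Hartree potential goes over into the Coulomb form.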


§ 6. THE INTERMEDIATE COUPLING REGION: FEYNMAN’S CALCULATIONS


From the remarks made in § 4.7 and § 3.4 it is quite clear that there is
an intermediate coupling region where neither the low nor the high frequency
approximation is adequate to treat the effective mass problem. Therefore,
as remarked by Fröhlich (1954), it would be advantageous to have a
variational approach combining the merits of the two methods.

A first, but unfortunately not very successful, guess for a suitable wave
function was made by Buckingham (1954) (see § 7.5). Recently definite
advances have been made by Feynman (1955) and by Höhler (1954,
1955a, 1955b)†. The methods of these authors admit a continuous
transition between the low and high frequency approximations, but as
yet have not been exploited to resolve the effective mass problem:
the numerical work required would be quite considerable.

Here we shall very briefly discuss the work of Feynman and Höhler,
and quote their results, examining where possible the relationship to the
work already described.

Feynman’s variational approach to the problem is based on his wave
mechanical adaptation of Huygens’ principle. The transition amplitude
is represented as a ‘path integral’. The field oscillators are integrated
out in terms of the path of the electron. Thus if the electron emits a
quantum when at space time point (x, t), it experiences at a later space
time point (x′, t′) a potential proportional to

$$\frac{\exp[-i\omega(t'-t)]}{|x'-x|}. \qquad (6.1)$$

The Coulomb potential in (6.1) is too difficult to treat in the path integral.
Feynman has replaced it by an expression of the form

$$c\,\exp[-iw(t'-t)]\,(x'-x)^2, \qquad (6.2)$$

which can be treated exactly in the path integrals. The energy is
obtained variationally by an averaging over a motion governed by (6.2)
and the parameters c and w are varied to give the best values. At first
sight the expression (6.2) looks as though we have an electron interacting
with a polarization field via a harmonic potential instead of a Coulomb
potential! This is not, however, a profitable or sensible interpretation of
(6.2)! The reason why (6.2) is successful is that, according to Feynman,
it gives the same motion for the electron that one would have if the
electron were bound harmonically to a second fictitious particle. It is
a peculiarity of the harmonic potential that the introduction of a fictitious
particle in this way leads to an expression so similar in construction to
(6.1).

† I wish to thank Professor Fröhlich for showing to me pre-publication
copies of the work of Professor Feynman and Dr. Höhler.

For strong coupling the fictitious particle is made very heavy so that the
electron at any given instant is moving in a harmonic potential which is
itself moving slowly through the lattice. This takes account of the
localization energy −¾ħω. It also takes account of the fluctuation
energy −(3 log 2)ħω because the motion of the electron is governed
by the harmonic potential but is otherwise unconstrained, all excited
states in the potential being taken into account in the path integral.
Thus Feynman obtained for the strong coupling case just the expression
(5.31) which has been derived here by a simpler method.

By allowing the electron to move freely (c=0 in (6.2)), Feynman
obtains the upper limit −αħω already found by Gurari (1953), and by Lee
and Pines (1952).

By adjusting c and w suitably for small α Feynman finds perturbation
expansions for the energy which are given in table 1. They will be
seen to compare quite favourably with the results of Lee et al. (1953),
which latter as already mentioned should be exact to order α² as well as
approximate for α<6.

For intermediate coupling Feynman’s method requires numerical 
minimization of a one-dimensional integral. 

The effective mass for small α obtained by Feynman is given by

$$m^*/m = 1 + \alpha/6 + \alpha^2/40 + \dots \qquad (6.3)$$

For large α his method gives, in first approximation, the Landau-Pekar
value (4.19b). Carrying the asymptotic expansion one stage
further than in Feynman (1955), one finds (we shall not discuss the details
of the calculation, which follow quite straightforwardly from Feynman’s
eqn. (45))

$$m^*/m = 16\alpha^4/(81\pi^2) - (4\alpha^2/3\pi)(1 + 2\log 2) + \dots = 0.0200\,\alpha^4 - 1.01\,\alpha^2 + \dots \qquad (6.4)$$

Writing eqn. (6.4) as 200(α/10)⁴ − 101(α/10)² + ... we see that the
asymptotic expansion is quite inadequate in the transition region α ≈ 10
but should be fairly good for α > 20, say†.
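The numerical coefficients in the expansion, and the statement about its range of usefulness, can be checked directly. The sketch below simply evaluates the two terms written above; the sample values of α and the function name `mass_terms` are illustrative choices, not taken from the source.

```python
import math

c4 = 16.0 / (81.0 * math.pi**2)                               # coefficient of alpha^4
c2 = (4.0 / (3.0 * math.pi)) * (1.0 + 2.0 * math.log(2.0))    # coefficient of alpha^2

def mass_terms(alpha):
    """Leading term and first correction of the large-alpha mass expansion."""
    return c4 * alpha**4, c2 * alpha**2

print(round(c4, 4), round(c2, 2))
for alpha in (10.0, 20.0, 30.0):
    lead, corr = mass_terms(alpha)
    print(f"alpha={alpha:4.0f}  leading={lead:9.1f}  correction={corr:7.1f}  "
          f"ratio={corr / lead:5.3f}")
```

At α = 10 the correction is about half the leading term, whereas at α = 20 it has dropped to roughly an eighth, in line with the remark that the expansion becomes usable only well beyond the transition region.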


† This conclusion is confirmed by some recent calculations by Höhler (private
communication). A mass correction quite comparable in order of magnitude
to that in (6.4) is obtained from the ansatz (7.1 a).


§ 7. THE INTERMEDIATE COUPLING REGION: HÖHLER’S CALCULATIONS
7.1. Höhler’s Ansatz

Höhler (1954, 1955) has formulated a variational wave function
superior both to that used by Gurari and Lee and Pines (§ 3), and to
Pekar’s Hartree wave function (§ 4). The same wave function was
independently proposed by Tjablikow (1953 b). The method consists
in constructing an eigenstate of the total wave number by superposing
(with suitable phase factors exp (ik·y)) a set of Hartree-type states
localized at various points y in the lattice†.


Thus the following expression is taken to describe a polaron with total
wave number k

$$|\Psi(k)\rangle = \int d^3y\, \exp(ik\cdot y)\, \Omega(x-y)\, \exp\Big\{ \sum_v d_v(0)\exp(-iv\cdot y)\, b_v^{+} \Big\}\, |0\rangle. \qquad (7.1a)$$

On substituting x−y = −z we have

$$|\Psi(k)\rangle = \exp(ik\cdot x) \int d^3z\, \Omega(z)\, \exp(ik\cdot z)\, \exp\Big\{ \sum_v \exp(-iv\cdot z)\, d_v(0)\, b_v^{+} \exp(-iv\cdot x) \Big\}\, |0\rangle. \qquad (7.1b)$$

The combination $b_v^{+}\exp(-iv\cdot x)$ appearing in the form (7.1 b)
conserves momentum, which makes it quite clear that we are dealing with a
momentum eigenstate. Clearly the ansatz (7.1) must be better than the
Hartree approximation because it removes the localization; and it
cannot be worse than the Gurari type expression (3.3) since it is seen to
reduce to the latter if one chooses for Ω(z) a Dirac δ function δ³(z).
In fact, it must give a real improvement on the Gurari type wave function
if Ω is suitably chosen. For the integral over z of
Ω(z) exp{−i(v₁+...+v_n)·z} will be small if v₁+...+v_n is large, i.e. if the
electron recoil is large. It was seen in § 3.4 that it is just because of this
recoil that the dynamic method becomes inadequate for strong coupling.
Evidently Höhler’s ansatz introduces the required correlations.
Furthermore, Höhler has shown (1955 b) that for weak coupling his method is
practically as good as perturbation theory (see table 1). Thus it is
evident that the method is capable of giving a continuous transition
between the low and high frequency approximations.

Unfortunately, however, a considerable amount of numerical work is
yet required in order to do justice to the ansatz in the intermediate
coupling region. The numerical integrations involved are usually
one-dimensional (see Höhler 1955 a), but numerical minimization is required.


7.2. The Polarization as a Fictitious Localized Particle 
Equations (7.1) represent the polarization field as behaving in some
respects like a ‘particle’ at the point y. The state vector describing this
‘particle’ is the expression

$$\exp\Big\{ \sum_v d_v(0)\exp(-iv\cdot y)\, b_v^{+} \Big\}\, |0\rangle.$$

The electron at x and the ‘particle’ at y are correlated in space by the
radial wave function Ω(x−y) in the same way that an electron and proton
are correlated in a hydrogen atom in an eigenstate of the total momentum.
The pair, electron + ‘particle’, can appear anywhere in the lattice,
because of the integration over y (compare Fröhlich 1954, § 6.2). The
total momentum of the system arises through the factor exp (ik·y)
in eqn. (7.1 a) or through the factor exp (ik·x) in eqn. (7.1 b).

† Just as one sometimes constructs Bloch wave functions by superposing
suitably phased atomic wave functions referred to various lattice sites y. Here
the sites form a continuum, however.


7.3. Improved Version of the Ansatz 
In general it is better to replace the factor exp (ik·y) by, for example

$$\exp\{ ik\cdot( y\eta^{-1} - x(1-\eta)\eta^{-1} ) \}, \qquad (7.2)$$

where η is a further parameter. Then η=1 corresponds to a very heavy
‘particle’ and is appropriate to the strong coupling region (in general
we would suppose the momentum carried by each particle to be
proportional to its respective mass, thus the mass of the fictitious particle
is m/(η−1)). Inserting the factor (7.2) into eqn. (7.1 a) and making the
substitution x−y = −z as before, we find an improved form |Ψ′_η(k)⟩
for Höhler’s ansatz,

$$|\Psi'_\eta(k)\rangle = \exp(ik\cdot x) \int d^3z\, \Omega(z)\, \exp(ik\cdot z/\eta)\, \exp\Big\{ \sum_v \exp(-iv\cdot z)\, d_v(0)\, b_v^{+} \exp(-iv\cdot x) \Big\}\, |0\rangle. \qquad (7.3)$$
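The way the phase factor (7.2) turns into the factors exp (ik·x) exp (ik·z/η) of (7.3) under the substitution y = x + z can be confirmed with a one-dimensional numerical check. The numbers chosen below are arbitrary, and the function names are hypothetical; the mass relation m/(η−1) is the one quoted in the text.

```python
import cmath

def phase_72(k, x, y, eta):
    """Phase factor (7.2), written in one dimension."""
    return cmath.exp(1j * k * (y / eta - x * (1.0 - eta) / eta))

def phase_73(k, x, z, eta):
    """Product exp(ik x) exp(ik z / eta) appearing in (7.3)."""
    return cmath.exp(1j * k * x) * cmath.exp(1j * k * z / eta)

k, x, z, eta = 0.7, 1.3, -0.4, 1.25
lhs = phase_72(k, x, x + z, eta)        # substitute y = x + z
rhs = phase_73(k, x, z, eta)
fictitious_mass = 1.0 / (eta - 1.0)     # in units of the electron mass m
print(abs(lhs - rhs), fictitious_mass)
```

For η = 1.25 the fictitious particle is four times as heavy as the electron, and the mass grows without limit as η approaches 1, which is the strong coupling limit referred to above.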
Höhler was led to a wave function |Ψ_η(k)⟩ similar to the wave
function |Ψ′_η(k)⟩ (7.3) by a different route. He constructed first a
localized wave function (Höhler, 1955 a, §§ 10, 11 and 12) of the form

$$|\Psi_\eta(y)\rangle = \Omega(x-y)\, \exp\Big\{ \sum_v \exp[-i(1-\eta)\, v\cdot(x-y)]\, d_v(0)\, b_v^{+} \exp(-iv\cdot y) \Big\}\, |0\rangle. \qquad (7.4)$$

Then he constructed an eigenstate of the total wave number from the
wave function (7.4) as already explained, thus

$$|\Psi_\eta(k)\rangle = \int d^3y\, \exp(ik\cdot y)\, |\Psi_\eta(y)\rangle.$$

The substitution x−y = −z/η gives

$$|\Psi_\eta(k)\rangle = \exp(ik\cdot x) \int d^3z\, \Omega(z/\eta)\, \exp(ik\cdot z/\eta)\, \exp\Big\{ \sum_v \exp(-iv\cdot z)\, d_v(0)\, b_v^{+} \exp(-iv\cdot x) \Big\}\, |0\rangle. \qquad (7.5)$$

If Ω is varied freely then the ansatz (7.5) is equivalent to (7.3), and the
only difference from the first ansatz (7.1) lies in the factor exp (ik·z/η).


7.4. The Connection with Feynman's Work 

In Feynman’s method the electron is made to move under the influence
of a second fictitious particle to which it is bound harmonically (§ 6), all
excited states in the potential being available. We have just seen that
in Höhler’s method the centre y of the lattice polarization plays the role
of the coordinate of a second particle, but the electron is not allowed to
go into excited states in the potential due to the second particle.
Therefore the two methods are not equivalent, and in fact for strong coupling
Feynman’s must be superior as it gives a good account of the fluctuation
energy which can only be brought in through inclusion of all the excited
states.


7.5. The Fluctuation Energy and Höhler’s Localized Ansatz


We shall now examine the significance of the localized wave function
(7.4). This wave function was introduced independently by Buckingham
(1954) (Buckingham’s parameter a corresponding to 1−η in Höhler
(1955 a), §§ 10-12).

Our main interest in (7.4) is that it furnishes, through Höhler’s
calculations, a valuable check on the validity of the estimate ¾ħω for the
localization energy (see § 7.6).

If a Coulomb wave function (4.19 a) is used for Q in (7.4) and all 
parameters varied then for large « Buckingham finds the optimal values are 
B=3a and 1—yj=32/(5a?), and the energy is lowered below the Hartree 
value by 24w. Likewise Höhler has found (private communication) a
lowering by ¾ħω using 1−η = 9π/(4α²) with a Gaussian wave function (4.19 b).

Höhler’s result can be completely explained by showing the equivalence
of |Ψ_η(y)⟩ with the localized Pekar-type adiabatic approximation of
§ 5.3. This equivalence holds only if a Gaussian wave function is taken for
the electron, and of course only for large α, i.e. to order α⁰ in the energy.

To demonstrate the equivalence we expand the exponential
exp {−i(1−η)v·(x−y)} in eqn. (7.4), this being legitimate since 1−η is
small. We obtain, taking y=0,

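$$|\Psi_\eta(0)\rangle \approx \Big\{ 1 - i(1-\eta) \sum_v d_v(0)\, b_v^{+}\, v\cdot x \Big\}\, \Omega(x)\, \exp\Big\{ \sum_v d_v(0)\, b_v^{+} \Big\}\, |0\rangle. \qquad (7.6)$$

For a Gaussian one has

$$x\,\Omega(x) = -\beta^{-2}\, \nabla\Omega(x), \qquad (7.7)$$

hence the wave function (7.6) becomes, to the order required,

$$\Omega(x-r)\, \exp\Big( \sum_v d_v(0)\, b_v^{+} \Big)\, |0\rangle, \qquad (7.8)$$

where r is the following operator

$$r = -i(1-\eta)\,\beta^{-2} \sum_v d_v(0)\, b_v^{+}\, v. \qquad (7.9)$$

Changing over to the sine and cosine expansion by eqns. (5.2) and (5.5) we
find

$$r = \tfrac{1}{2}(1-\eta)\,\beta^{-2} \sum_v{}' d_{1v}(0)\,(q_{2v} - i p_{2v})\, v. \qquad (7.10)$$

The displacements $d_{2v}$ for the oscillators involved in eqn. (7.10) are zero,
thus $q_{2v} = q'_{2v}$. Moreover, in formal analogy to eqn. (7.7) we find that

$$p_{2v} \exp\Big( \sum_v d_v(0)\, b_v^{+} \Big) |0\rangle = i\, q_{2v} \exp\Big( \sum_v d_v(0)\, b_v^{+} \Big) |0\rangle.$$

Thus we can replace (7.10) by an expression involving only $q'_{2v}$, namely

$$r \rightarrow (1-\eta)\,\beta^{-2} \sum_v{}' d_{1v}(0)\, q_{2v}\, v = (1-\eta)\,\beta^{-2} \sum_v{}' d_{1v}(0)\, q'_{2v}\, v. \qquad (7.11)$$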
Substituting 1−η = 9π/(4α²), β = ⅓α√(2/π) and m*/m = 16α⁴/(81π²) one finds

$$r \rightarrow (2m/m^*) \sum_v{}' d_{1v}(0)\, q'_{2v}\, v. \qquad (7.12)$$

A comparison of eqns. (7.12) with (5.12) and (7.8) with (5.7) now makes
the equivalence of the localized ansatz (7.4) with the localized Pekar-type
approximation of § 5.3 obvious.
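The substitution leading to (7.12) amounts to the identity (1−η)β⁻² = 2m/m*, which is independent of α. A short numerical check, using only the three values quoted in the text (the function name `prefactors` and the sample α are illustrative):

```python
import math

def prefactors(alpha):
    """Return ((1 - eta) / beta^2, 2m/m*) for the Gaussian values quoted above."""
    one_minus_eta = 9.0 * math.pi / (4.0 * alpha**2)
    beta = (alpha / 3.0) * math.sqrt(2.0 / math.pi)
    mass_ratio = 16.0 * alpha**4 / (81.0 * math.pi**2)     # m*/m
    return one_minus_eta / beta**2, 2.0 / mass_ratio

lhs, rhs = prefactors(12.0)
print(lhs, rhs)
```

Both expressions reduce to 81π²/(8α⁴), so the coefficient of the sum in (7.11) coincides with that in (5.12).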


7.6. Results for Strong Coupling 


Höhler (1955 a) has obtained asymptotic expansions from the momentum
eigenstate (7.1), using the Gaussian Ω given by eqn. (4.19 b) with the
$d_v$ specified in terms of Ω by the Hartree approximation (i.e. by eqns.
(4.12) or (5.4)). We would naively expect just the removal of the
localization energy which in § 4.7 was estimated to be ¾ħω. The method does
not allow the electron to follow the polarization fluctuations adiabatically
and therefore would not be expected to give as good a result as Pekar’s
variational formulation of the adiabatic approximation.† Nevertheless
the result reported is that obtained by Pekar, i.e. a lowering of the energy
by (3/2)ħω below the Hartree value.

Thus the localization energy of the Hartree state is just double the value
¾ħω estimated in § 4.7, eqn. (4.23)! (But we shall presently see that only
the value ¾ħω has an ultimate physical significance.)

Another quantity which Höhler (1955 b) has calculated from the
momentum eigenstate (7.1) is the effective mass for large α. This turns
out to be one half of the Landau-Pekar value 16α⁴/(81π²).

These two surprising results are probably related, since according to the 
method of § 4.7 the localization energy is expected to be inversely pro- 
portional to the effective mass. 

Since we have every reason to believe in the mathematical validity of
the Landau-Pekar mass value (which in fact has been derived rigorously
by Bogoljubow and Tjablikow), we must conclude that the momentum
eigenstate (7.1) is not suitable for calculating the effective mass. It must
be an impure state, containing admixtures which double the kinetic
energy as a function of the momentum.

In view of these circumstances we must be very cautious in applying the 
simple estimate j/w for the localization energy. We must take note of 
the fact that the true value (i.e. the value calculated using Héhler’s 
method) will depend on whether or not the localized state used can in 
reality be considered to be a wave packet built up from true polaron 
states, without higher admixtures. If it can be, then the value sho 
ass oA we 

+ Hohler’s wave function in the representation of § 5 becomes 

Jay Q(x—y) exp {—Fyav (Gav—dew (y))?}. 
This differs from Pekar’s wave function (eqn. (16.1) in Pekar 1954) which can be 
written in the form 
Ja?y8(y — r(q)) Q(x —y) exp { —F>"av(daev—Cavl ¥))?}, 
where, for any configuration g, the electron energy is minimized by the electron 
wave function Q(x—r(q)).

should hold (since the mean square wave number is in first approximation
independent of whether or not we deal with a pure state). Now to make
sure that we are dealing with a pure state we must allow the electron to
follow the lattice vibrations adiabatically; a consideration which gives
every support to the opinion that the final formula (5.21) for the total
energy is correct.

This conclusion is further upheld by some recent calculations with the 
state |¥,(k)), eqn. (7.5). Höhler (private communication) has found
that varying 7, using a Gaussian Q, gives a minimum rest energy agreeing 
with Pekar’s value (5.16). Since the best localized Gaussian state
|Ψ(y)⟩, eqn. (7.4), has an energy already lower than the Hartree value
by (3/4)ħω (cf. § 7.5), we have here an actual example of a case where the
removal of the localization only brings down the energy by (3/4)ħω.


§ 8. CONCLUSIONS 


This review represents an attempt to understand and illustrate the 
principal mathematical features of the polaron problem, so far as these 
are known at present. It is hoped that the considerations advanced here 
will help to show the way to improved approximation methods covering 
the whole range of the coupling parameter. 

To avoid an unnecessarily unwieldy summing up, the results of calcula- 
tions on the problem have been collected in tables 1 and 2 following the 
present section. Here just the minimum of results required to elucidate 
the situation are quoted. 

The properties of a low energy electron in a perfect isotropic polar
crystal are seen to be determined by the magnitude of a dimensionless
coupling parameter α, which is inversely proportional to the square root
of the oscillation frequency of the longitudinal polarization modes of the
crystal (eqn. (2.26)).

When the electron emits or absorbs a polarization quantum of short
wavelength it receives a large recoil momentum. If α<6 the frequency
ω is sufficiently large for most of the polarization field to follow the
electron through space during such a recoil. This suggests a wave function
in which the state of each polarization oscillator depends only on the
space coordinates of the electron (§ 3). Such a wave function can be
used variationally for all values of α, showing that the rest energy of a
polaron lies below −αħω. This gives a valid description of the true
state of affairs only for α<6, however. The method can be improved
somewhat by using the perturbation theory of Lee et al. (1953) (§ 3.3).
The rest energy and effective mass on this picture are approximately
{−α − 1.4(α/10)²}ħω and {1 + α/6 + 2.0(α/10)²}m respectively.

As α is increased (ω decreased) it becomes progressively more difficult
for the shorter wavelength polarization modes to follow the fluctuations
in the electron’s position coordinates, and the high frequency approximation
breaks down. The failure of the approximation can be traced to the
need for correlation between the polarization quanta. In the exact
wave function the quanta with high momenta must be correlated so that
their momenta will largely compensate each other, thus reducing the
recoil velocity of the electron.

The problem can be tackled also from the opposite extreme of very large
α, say α ≳ 20. In this case ω is very small and the electron can rapidly
adjust itself to take best advantage of the instantaneous state of the
polarization field. Thus in this low frequency approximation, the electron
follows the polarization adiabatically. Since the field cannot easily move
with the electron, it follows that the electron must stay with the field if
the energy is to be made low. This suggests a Hartree approximation in
which electron and polarization are localized in space (§ 4). This approach
gives a polaron rest energy and effective mass of order −0.1α²ħω and
0.02α⁴m respectively.

The Hartree method is not quite a true representation of the low 
frequency approximation, for the electron is not following the polarization 
changes adiabatically, nor is the total momentum diagonal. A complete 
formulation of these requirements necessitates a quite elaborate mathe- 
matical apparatus, which has been set up by Bogoljubow and Tjablikow. 
The situation can however be examined quantitatively in a simple manner, 
as shown in § 5. This simplification arises through attributing the changes
in the polarization field to two causes. In the first place the field, being 
quantized, must necessarily be subject to zero point fluctuations. If 
these fluctuations were just the same as in the Hartree approximation 
then the polaron would remain localized, but its energy would be lowered 
through the adjustment of the electron’s wave function to take best 
advantage of the field fluctuations. The consequent lowering of the
energy has been shown to be more than (3/2)ħω and a not unreasonable
estimate gives 3 log 2 ħω = 2.08ħω. In the second place, the very fact
that every fluctuation in the polarization field is accompanied by a change 
in the electron wave function implies also a change in the restoring force 
exerted by the mean electron charge density on the polarization. Thus 
the fluctuations will in fact differ a little from those in Hartree approxima- 
tion. The correction to be made on this account is just the localization 
energy arising from the artificial localization around a fixed space point 
entailed by the Hartree approximation. The reason is the following. 
Any fluctuation accompanied by a change in shape of the electron cloud 
is accompanied by a strong restoring force, while any fluctuation accom- 
panied only by a translation of the electron cloud induces no restoring 
force whatever since the Hartree states at different points in the crystal 
are degenerate. Quite evidently the highest order effect will arise from 
the mixing in of these degenerate states. The changes induced in the 
zero point motion through changes in shape of the electron cloud will not 
give contributions to the same order. Thus we are dealing here just with 
the effects of localization. The calculation of the localization energy can
be carried out using Höhler’s method which is based on the superposition
of localized states (§ 7). However, it can also be obtained by very plausible
and simple considerations based on the notion that the localized state is
nothing other than a polaron wave packet (§ 4.7), and it is expected that
Höhler’s method, taking due account of the adiabatic approximation of
§ 5.4, would lead to the same result (cf. § 7.6).

The elegant method of Bogoljubow and Tjablikow is capable of giving
a rigorous and complete account of the low frequency approximation,
and merits a very detailed investigation. The present simplified methods 
lead to a result a little different from that obtained by these authors (see 
the remarks following eqn. (5.21)). It is likely that the discrepancy is 
due to their neglect of a subsidiary condition. 

Perhaps the most remarkable feature of the low frequency case is the
enormous polaron effective mass, which, much more than the rest energy,
indicates a great difference between the low and high frequency approximations.
Thus, for α=10, which is if anything somewhat beyond the
range of the high frequency approximation, the calculations of Lee et al.
give m* ≈ 4.7m while the Hartree approximation gives m* ≈ 200m! (see
table 2). Until an adequate approximation for the intermediate region
has been established we cannot say how the transition between these very
different values occurs. However, for α<6 the high frequency value
m{1 + α/6 + 2.0(α/10)²} obtained by Lee et al. should be quite adequate.
This will cover nearly all practical applications of the theory, as indicated
by Fröhlich (1954).
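
The contrast between the two limiting mass estimates can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the two formulas are simply the high frequency expression 1 + α/6 + 2.0(α/10)² and the order-of-magnitude low frequency estimate 0.02α⁴ quoted in this section, each applied outside as well as inside its range of validity.

```python
"""Compare the two limiting polaron effective-mass estimates quoted in the text."""


def mass_high_frequency(alpha: float) -> float:
    """m*/m from the high frequency form 1 + alpha/6 + 2.0 (alpha/10)^2."""
    return 1.0 + alpha / 6.0 + 2.0 * (alpha / 10.0) ** 2


def mass_low_frequency(alpha: float) -> float:
    """m*/m from the order-of-magnitude low frequency estimate 0.02 alpha^4."""
    return 0.02 * alpha ** 4


if __name__ == "__main__":
    # Print both estimates side by side for a few coupling strengths.
    for alpha in (2.0, 6.0, 10.0, 20.0):
        high = mass_high_frequency(alpha)
        low = mass_low_frequency(alpha)
        print(f"alpha={alpha:5.1f}  high-frequency m*/m={high:8.2f}  "
              f"low-frequency m*/m={low:10.1f}")
```

At α=10 the two functions return roughly 4.7 and 200, the pair of values contrasted above.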

Corrections to the effective mass for low frequencies (large α) could be
obtained by extending the fluctuation energy calculations given in § 5 to
a moving polaron, using the variational method developed in § 4. It is
expected that the corrections would be very large for α<20, as indicated by
Feynman’s work (see eqn. (6.4)). We may therefore ascribe the lower limit
α ≈ 20 to the low frequency region.

The intermediate region has been investigated by Feynman (§ 6) and
Höhler (§ 7). The meaning of the wavefunctions etc. used by these
authors is quite clear when ω is small or large, as we have seen. The
status of these methods in the intermediate region is not clear at present.
Since the behaviour is determined by the magnitude of the oscillation
frequency ω one might expect that an adequate theory would show some
features characteristic of resonance as ω passes from high to low values.
Such a resonance would of course be concerned only with the shorter
polarization waves. The exhibition of resonance behaviour by the expressions
obtained by Höhler and Feynman certainly cannot at present be
ruled out, since the necessary numerical calculations have not been
performed. Evidently much research is still required.


Table 1. Rest Energy/ħω


(i) Variational Upper Bounds for all α

−α, Lee and Pines (1952), Gurari (1953).

−aα², where
    a = 0.1088, Pekar (1949, 1954).
    a = 25/256 = 0.0977, Fröhlich (1954).
    a = 1/(3π) = 0.106, Feynman (1955), Pekar and Deigen (1948), Pekar (1954).


(ii) Variational Upper Bounds for Small α

−α − 1.57(α/10)² + ..., Höhler (1955 b).
−α − 1.23(α/10)² + ..., Feynman (1955).
−α − 0.98(α/10)² − 0.60(α/10)³ − 0.14(α/10)⁴ + ..., Feynman (1955).


(iii) Non-Variational Estimate for α<6

−α − 1.4(α/10)², Lee et al. (1953) (see first footnote to § 3.4).


(iv) Variational Upper Bounds for Large α

−aα² − b′ + ..., where a ≈ 0.1088, b′ > 3/2, Pekar (1949, 1954).
−α²/3π − (3 log 2 + 3/4) + ... = −0.106α² − 2.83 + ..., Feynman (1955).
−0.106α² − 1.5 + ..., Höhler (1955 a).

The exact result for large α is of the form −aα² − b − c + ..., where
a ≈ 0.1088 and b + c is estimated to be not very different from
3 log 2 + 3/4 = 2.83.


(v) Other Variational Calculations

−5.52 for α=5.2, −11.17 for α=10, −17.56 for α=15, Lee and Pines (1953).


Table 2. Effective mass/m


(i) Small α

1 + α/6 + ..., Gurari (1953).
1 + α/6 + 2.0(α/10)², α<6, Lee et al. (1953) (see first footnote to § 3.4).
1 + α/6 + 2.5(α/10)² + ..., Feynman (1955).
2.21 for α=5.2, 3.96 for α=10, 6.35 for α=15, Lee and Pines (1953).


(ii) Large α, Asymptotic Limit

o4(8)3/12 —0-020804, Frohlich (1954). 

16α⁴/81π² = 0.0200α⁴, Feynman (1955).

0.0208α⁴, Pekar (1949, 1954).


(iii) Large α, Calculation based on Feynman (1955)

16α⁴/81π² − 4α²(1 + 2 log 2)/3π + ...
= 0.0200α⁴ − 1.01α² + ...
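
The closed-form coefficients in tables 1 and 2 can be checked against their quoted decimal values with a short script. This is a sketch for the reader's convenience, not part of the original calculations; the dictionary and function names are hypothetical, "log" is taken to be the natural logarithm, and the fraction accompanying 3 log 2 is read as 3/4 because that reading reproduces the quoted 2.83.

```python
import math

# Each entry pairs a closed form from the tables with the decimal quoted beside it.
COEFFICIENTS = {
    "1/(3*pi)": (1.0 / (3.0 * math.pi), 0.106),
    "25/256": (25.0 / 256.0, 0.0977),
    "3*log(2) + 3/4": (3.0 * math.log(2.0) + 0.75, 2.83),
    "16/(81*pi^2)": (16.0 / (81.0 * math.pi ** 2), 0.0200),
    "4*(1 + 2*log(2))/(3*pi)": (4.0 * (1.0 + 2.0 * math.log(2.0)) / (3.0 * math.pi), 1.01),
}


def max_relative_error(table=COEFFICIENTS) -> float:
    """Largest relative difference between a closed form and its quoted decimal."""
    return max(abs(exact - quoted) / abs(quoted) for exact, quoted in table.values())


if __name__ == "__main__":
    for name, (exact, quoted) in COEFFICIENTS.items():
        print(f"{name:26s} exact={exact:.5f}  quoted={quoted}")
    print("largest relative error:", max_relative_error())
```

All five closed forms agree with the tabulated decimals to within rounding.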


§ 1. INTRODUCTION


Our understanding of the motion of a fluid under the action of external
non-electromagnetic forces has now reached a very advanced state both 
experimentally and theoretically over an extremely wide range of con- 
ditions. For liquids and for gases, at all but the lowest pressures, the 
fluid motion can be described by the equations of macroscopic hydro- 
dynamics, according to which the fluid is treated as a continuum. The 
continuum theory breaks down for a gas at extreme dilution, where the 
mean free path of the molecules between interactions is at least of the 
same order as the characteristic length of the system ; here the continuum 
theory is to be replaced by the kinetic theory. Independently, the 
behaviour of an electromagnetic field has been intensively studied. 
Although the problem of the motion of conduction currents in a magnetic 
field has been considered for some time in the magneto-ionic theory, it 
was only in 1942 that a systematic study of the hydrodynamics of a 
conducting fluid immersed in a magnetic field was begun by Alfvén. 
This study has become known as that of magnetohydrodynamics, or by
some authors, hydromagnetics ; it is concerned with physical systems 
specified by the equations that result from the fusion of those of hydro- 
dynamics and those of electromagnetic theory. Certain aspects of 
magnetohydrodynamics form the subject matter of the present paper, 
the aim being to gather into a single paper work that is widely scattered 
throughout the literature. It may be of help to the reader to refer him 
also to other complementary works by Alfvén (1950), Lundquist (1952), 
and Elsasser (1955); of the greatest value is the printed record of a 
discussion on magnetohydrodynamics organized by the Royal Society
and held on May 5, 1955: Proc. Roy. Soc. A, 233. The papers collected 
there contain also an account by Shercliff on some engineering applications 
of magnetohydrodynamics. 

The theory can be conveniently divided into two parts, distinguished 
by the validity, or non-validity, of the continuum theory. The continuum 
theory can be expected to apply to many systems of interest, the conditions 
for validity being that the characteristic interaction length and time are 
both small on a macroscopic scale and further the radius of particle gyration 
curvature is large on this scale. Exceptions to these conditions will occur 
under certain extreme astrophysical conditions and under certain 
laboratory conditions such as those found in the gas discharge tube. 
These exceptions, although of great interest and potential importance, 
show extreme complication and are far from understood at the present 
time. 

Even in the case of the continuum theory problems of great complexity 
arise. The equations of hydrodynamics are non-linear, and this property 
is carried over into magnetohydrodynamics. A moving conducting fluid 
in a magnetic field is acted on by a force (the Lorentz force) which is the 
vector product of the fluid velocity and the magnetic field. The coupling 
between the fluid velocity and the magnetic field through a vector product
means that magnetohydrodynamics must be treated in its full three-dimensional
form, no reduction to a two-dimensional form being possible.
The increased complications are not without new phenomena. The most 
significant of these is the general occurrence in magnetohydrodynamics 
of transverse wave fields in addition to the customary longitudinal fields. 
This feature is characteristic of magnetohydrodynamics, and indicates 
that the presence of a magnetic field allows a conducting fluid to withstand 
shear in a more permanent manner than when the field is not there. 

In order to keep to a reasonable length any survey of as wide and as 
rapidly growing a subject as magnetohydrodynamics it is necessary to 
restrict its scope in several directions. And even in its restricted form it 
is inevitable that some work should be omitted. As a general criterion 
we have omitted those parts of the theory in which least general agreement 
exists, and in which the hydrodynamic–magnetic coupling does not play
a primary rôle. The argument is restricted to those systems which are
amenable to treatment using the continuum theory, and associated
macroscopic variables. These several restrictions necessitate the omission
of the interesting work of Åström (1950), Dungey (1954), and Ferraro
(1955), and others on plasmas, and also the recent attempts to understand
magnetohydrodynamic turbulence (see for example Chandrasekhar 1955). 
Further, classical physics is used throughout, no appeal being made to 
quantum theory. Finally we have omitted those aspects of the theory 
which are mainly of astrophysical or geophysical interest. 

The present paper is divided into two parts. The first part is devoted 
to small disturbances. The equations of magnetohydrodynamics are first 
set down and discussed in § 2. The conditions necessary for the magnetic 
force to control the fluid motion are then considered (§ 3). In § 4, solutions 
of the magnetohydrodynamic equations having a wave form are derived 
for both incompressible (§ 4.1) and compressible (§ 4.2) media. Next the 
effect of a magnetic field on the thermal conduction of conducting fluids 
is treated (§ 5), particular emphasis being placed on problems of thermal 
instability. After a discussion of steady conditions (§ 6), the first part
of the paper ends with some indications of the practical demonstrations
of the theory that have so far been achieved. The second part of the paper
is devoted to high velocity (shock) disturbances. In § 8 some features 
of the theory are considered, and this is followed (§ 9) by a discussion 
of the propagation of shocks. Finally, in § 10, attention is turned to 
the recent work on the structure of shocks. The paper ends with a few 
concluding remarks (§ 11) in which attention is drawn to the need for 
more experimental data, and theoretical work of a general nature. 


Part I. SMALL DISTURBANCES
§ 2. THE MAGNETOHYDRODYNAMIC EQUATIONS


In treating the continuum hydrodynamics of an electrically conducting
fluid immersed in an electromagnetic field, the equations describing the
motion are well known, since they are obtained by combining the equations
of conventional macroscopic hydrodynamics with the equations of
electromagnetism. For a general, viscous, compressible conducting fluid these
two sets of equations are :

$$\text{(b)}\quad \frac{D\mathbf{v}}{Dt} \equiv \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})\,\mathbf{v} = \mathbf{F} - \frac{1}{\rho}\operatorname{grad}p + \nu\nabla^{2}\mathbf{v} + \tfrac{1}{3}\nu\operatorname{grad}\operatorname{div}\mathbf{v} \tag{2.1}$$


(in Gaussian units) >. . . . . (2.2) 


ly 1 
(e) fae let Blt ge 


(f) div D=4rq 
(g) div B=0 


J 

We assume also the constitutive relations B=μH; D=κE. Further
for a compressible fluid a relation is also to be assumed between the density 
and pressure. The two sets of eqns. (2.1) and (2.2) are linked by the 


Lorentz force. F=F,,+F,. Tie eee met ot Fe 2* 3) 


The insertion of (2.3) into (2.1 b) introduces the electromagnetic field
vectors into the hydrodynamic equations. Written explicitly the Lorentz 
force is 


Eis _ wc [dE q 
Vener aM abe WCE xH] +46. nay 24) 


The three terms represent, respectively, contributions from the conduction, 
displacement, and convection currents. 

From (2.1), (2.2), (2.3) and (2.4) are formed the differential equations 
for the determination of the three vectors v, H, and E: 


Dv _ pe pe [OE gE 
(a) Die eRe Secu oe wena ce ai xH | + 3 


Be grad p+vV2v-+ 4 grad div v 
p 


DH : kpccttas a tee 
(6) Tks ie einen et SN H div v (2.5) 
0E Poop ey Un it a, ) LO 
Bad ee Saal Sea So mag Reo (eda g QS SE H 
(c) ot ooo tee et a(S pee. | 
2 
+= grad q 


CN es es 
4np.o 


456 G. H. A. Cole on 


These are the basic equations of the macroscopic theory; they are non-linear,
and in consequence in general extremely difficult to solve. The
first two equations contain both v and H, but neither contains E. Therefore
these two equations are to be solved simultaneously for v and H. The
vector E, if required, is then to be obtained from (2.5 c), or more easily
perhaps, directly from (2.2): no loss of generality results if (2.5 c) is
neglected in what follows. Further, systems of physical interest most
often have zero charge density (q=0).


2.1. The Magnetohydrodynamic Approximation 

The non-linear form of the eqn. (2.5 a) together with the coupling between
the eqns. (2.5 a) and (2.5 b) makes it very difficult to find solutions of these
equations. Progress in the solution of these equations has been made 
for conditions in which the magnetic energy is very large as compared 
with the electric energy. Physically this condition amounts to neglect 
of the displacement current ; the approximation has been called the 
magnetohydrodynamic approximation by Chandrasekhar and has been 
used by many authors. Under this approximation eqns. (2.5) become : 


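$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{D\mathbf{v}}{Dt} = \mathbf{F}_h + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{H}\times\mathbf{H}] - \frac{1}{\rho}\operatorname{grad}p + \nu\nabla^{2}\mathbf{v} + \tfrac{1}{3}\nu\operatorname{grad}\operatorname{div}\mathbf{v} \\
\text{(b)}\quad & \frac{D\mathbf{H}}{Dt} - \lambda\nabla^{2}\mathbf{H} = (\mathbf{H}\cdot\operatorname{grad})\,\mathbf{v} - \mathbf{H}\operatorname{div}\mathbf{v}
\end{aligned}
\right\} \tag{2.6}
$$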


Further simplification of these equations is far from superfluous, and 
three particular restrictions have been found useful. 

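First, the fluid may be incompressible. In this case div v=0, (2.6)
becoming :

$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{D\mathbf{v}}{Dt} = \mathbf{F}_h + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{H}\times\mathbf{H}] - \frac{1}{\rho}\operatorname{grad}p + \nu\nabla^{2}\mathbf{v} \\
\text{(b)}\quad & \frac{D\mathbf{H}}{Dt} - \lambda\nabla^{2}\mathbf{H} = (\mathbf{H}\cdot\operatorname{grad})\,\mathbf{v}
\end{aligned}
\right\} \tag{2.7}
$$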


Secondly, the fluid may be inviscid (ν=0); and lastly the electrical
conductivity may be effectively infinite (λ=0). In this latter case, since
j=0, the electric field vector is 


E=—“[vxH]. uty Sybil Mee eT 


2.2. The Linearized Form
In order to obtain solutions of the eqns. (2.6) it has so far been necessary
to linearize them. This has been achieved by an appeal to the conventional
techniques of perturbation theory. Such an appeal will be valid if the
fluid velocity is small, and can lead to either stable or over-stable
conditions. The total magnetic field within the fluid, H, is now assumed
to be related to the externally applied field, H₀ (often referred to as the
seed field), according to

$$\mathbf{H} = \mathbf{H}_0 + \mathbf{h}, \tag{2.9}$$

where h is the unknown disturbance field which arises from the relative
velocity between the fluid and the seed field. If H₀ is given, the insertion
of (2.9) into eqns. (2.6) gives two equations involving the two unknowns
v and h; for an incompressible fluid


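$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{\partial\mathbf{v}}{\partial t} + [\operatorname{curl}\mathbf{v}\times\mathbf{v}] = \mathbf{F}_h + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{h}\times\mathbf{h}] + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{H}_0\times\mathbf{h}] \\
& \qquad + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{h}\times\mathbf{H}_0] + \frac{\mu}{4\pi\rho}\,(\mathbf{H}_0\cdot\operatorname{grad})\,\mathbf{H}_0 + \nu\nabla^{2}\mathbf{v} - \operatorname{grad}\mathscr{P} \\
\text{(b)}\quad & \frac{\partial\mathbf{h}}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})\,\mathbf{h} - \lambda\nabla^{2}\mathbf{h} - (\mathbf{h}\cdot\operatorname{grad})\,\mathbf{v} = (\mathbf{H}_0\cdot\operatorname{grad})\,\mathbf{v} - \frac{D\mathbf{H}_0}{Dt} + \lambda\nabla^{2}\mathbf{H}_0 \\
\text{(c)}\quad & \mathscr{P} = \frac{1}{\rho}\left(p + \tfrac{1}{2}\rho v^{2} + \frac{\mu H_0^{2}}{8\pi}\right)
\end{aligned}
\right\} \tag{2.10}
$$

These equations become linear if either (i) v, and consequently h also,
can be treated as small quantities; or (ii) if there is an equipartition
between the magnetic and kinetic energies

$$\tfrac{1}{2}\rho v^{2} = \frac{\mu h^{2}}{8\pi},$$

giving

$$\mathbf{v} = \pm\sqrt{\frac{\mu}{4\pi\rho}}\;\mathbf{h}. \tag{2.11}$$

Then the second term on each side of (2.10 a) disappears, as does the
term (h . grad)v − (v . grad)h (= curl [v × h])
in (2.10 b). The relationship (2.11) was first proposed by Walén (1946).
Ferraro (1954) has pointed out that the condition is valid for plane waves
if the electrical conductivity is infinite: van de Hulst (1951) previously
reached the same conclusion (cf. § 4.2).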
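
As a numerical illustration of the equipartition relation (2.11), the sketch below evaluates the speed at which the kinetic energy density ½ρv² equals the magnetic energy density μh²/8π. The function names are hypothetical, Gaussian units are assumed throughout, and the example values (a field of 10⁴ gauss in a fluid of density 13.6 g/cm³ with μ=1) are chosen only for illustration.

```python
import math


def equipartition_speed(h_gauss: float, rho: float, mu: float = 1.0) -> float:
    """Speed (cm/s) for which (1/2) rho v^2 = mu h^2 / (8 pi)."""
    if rho <= 0:
        raise ValueError("density must be positive")
    return abs(h_gauss) * math.sqrt(mu / (4.0 * math.pi * rho))


def energy_densities(v: float, h_gauss: float, rho: float, mu: float = 1.0):
    """Return (kinetic, magnetic) energy densities in erg/cm^3."""
    kinetic = 0.5 * rho * v ** 2
    magnetic = mu * h_gauss ** 2 / (8.0 * math.pi)
    return kinetic, magnetic


if __name__ == "__main__":
    h, rho = 1.0e4, 13.6  # illustrative values only
    v = equipartition_speed(h, rho)
    kinetic, magnetic = energy_densities(v, h, rho)
    print(f"v = {v:.1f} cm/s, kinetic = {kinetic:.4e}, magnetic = {magnetic:.4e}")
```

For these inputs the speed comes out at about 765 cm/s, and the two energy densities agree by construction.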

If H₀ is constant in time, and homogeneous in space, eqns. (2.10) under
the linear approximation, become considerably simplified. In terms of
the components of a Cartesian frame, vᵢ and hᵢ satisfy the equations


Ov; - Oh; 070, 

(a) saul =FPyy+ respect On, Tap 

(2.12) 

oh, Oh; 4 Ov; 

ot Tg AS, 

(i, j = x, y, z)

If the disturbance is not small these equations rely on the validity of (2.11);
the applicability of this relation when the electrical conductivity is finite
is still open to question.

The passage of a magnetohydrodynamic disturbance affects the fluid
pressure. For, taking the divergence of (2.10 a) it follows that :

$$p + \tfrac{1}{2}\rho v^{2} + \frac{\mu}{8\pi}\,(\mathbf{H}_0\cdot\mathbf{H}_0) + \frac{\mu}{4\pi}\,(\mathbf{h}\cdot\mathbf{H}_0) = \text{constant}.$$

If p₀ is the static pressure, and Δp = p − p₀,

$$\Delta p = -\frac{\mu}{8\pi}\left\{\frac{4\pi\rho v^{2}}{\mu} + 2\,(\mathbf{h}\cdot\mathbf{H}_0)\right\}. \tag{2.13}$$

The first term represents the contribution of the kinetic energy ; the 
second represents the contribution of the transfer of the magnetic energy 
through the disturbance, the sum of which over the whole disturbance 
vanishes. If equipartition holds, the first term can be replaced by its
equivalent μh²/8π. It appears that usually the pressure is reduced by
the passage of the disturbance, the effect then depending on square terms 
of the velocity. 


2.3. The Symmetric Form 


The magnetohydrodynamic equations have in general an uncompromising 
appearance, but it is possible to write them in a symmetric form if they 
refer to an incompressible fluid. This was shown by Elsasser (1950 a) 
and independently by Lundquist (1952). To achieve this transformation 
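the three new vectors V, l and m are defined according to :

$$\text{(a)}\quad \mathbf{V} = \sqrt{\frac{\mu}{4\pi\rho}}\;\mathbf{H}; \qquad \text{(b)}\quad \mathbf{l} = \mathbf{v} + \mathbf{V}, \quad \mathbf{m} = \mathbf{v} - \mathbf{V}. \tag{2.14}$$

Insertion into (2.7) gives

$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{D\mathbf{v}}{Dt} = \mathbf{F}_h + [\operatorname{curl}\mathbf{V}\times\mathbf{V}] - \frac{1}{\rho}\operatorname{grad}p + \nu\nabla^{2}\mathbf{v} \\
\text{(b)}\quad & \frac{D\mathbf{V}}{Dt} = (\mathbf{V}\cdot\operatorname{grad})\,\mathbf{v} + \lambda\nabla^{2}\mathbf{V}
\end{aligned}
\right\} \tag{2.15}
$$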

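Using the vector expansion for the second term on the right-hand side
of (2.15 a) and alternately adding and subtracting the two equations,
there result two equations involving l and m:

$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{\partial\mathbf{l}}{\partial t} + (\mathbf{m}\cdot\operatorname{grad})\,\mathbf{l} = \mathbf{F}_h - \operatorname{grad}\mathscr{P}_1 + \tfrac{1}{2}\alpha\nabla^{2}\mathbf{l} + \tfrac{1}{2}\beta\nabla^{2}\mathbf{m} \\
\text{(b)}\quad & \frac{\partial\mathbf{m}}{\partial t} + (\mathbf{l}\cdot\operatorname{grad})\,\mathbf{m} = \mathbf{F}_h - \operatorname{grad}\mathscr{P}_1 + \tfrac{1}{2}\alpha\nabla^{2}\mathbf{m} + \tfrac{1}{2}\beta\nabla^{2}\mathbf{l}
\end{aligned}
\right\} \tag{2.16}
$$

with

$$
\left.
\begin{aligned}
\text{(a)}\quad & \alpha = \nu + \lambda; \qquad \beta = \nu - \lambda \\
\text{(b)}\quad & \mathscr{P}_1 = (1/\rho)\,(p + \tfrac{1}{2}\rho V^{2})
\end{aligned}
\right\} \tag{2.17}
$$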

The eqns. (2.16), which bear a marked resemblance to the Navier-Stokes
equation, show a complete symmetry of form, one being obtained from
the other by an interchange of the vectors l and m.
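
The algebra leading from (2.15) to the symmetric pair can be sketched as follows. This is a reconstruction of the intermediate steps rather than a quotation; it assumes a uniform density (so that (1/ρ) grad p = grad(p/ρ)), writes F_h for the non-magnetic force, and uses the identity [curl V × V] = (V . grad)V − grad(½V²), with the upper signs giving the equation for l and the lower signs that for m.

```latex
\begin{align*}
\frac{\partial(\mathbf{v}\pm\mathbf{V})}{\partial t}
  + (\mathbf{v}\cdot\operatorname{grad})(\mathbf{v}\pm\mathbf{V})
  &= \mathbf{F}_h - \operatorname{grad}\Bigl(\frac{p}{\rho} + \tfrac12 V^{2}\Bigr)
     + (\mathbf{V}\cdot\operatorname{grad})(\mathbf{V}\pm\mathbf{v})
     + \nu\nabla^{2}\mathbf{v} \pm \lambda\nabla^{2}\mathbf{V}, \\[4pt]
(\mathbf{v}\cdot\operatorname{grad})\,\mathbf{l} - (\mathbf{V}\cdot\operatorname{grad})\,\mathbf{l}
  &= (\mathbf{m}\cdot\operatorname{grad})\,\mathbf{l}, \qquad
(\mathbf{v}\cdot\operatorname{grad})\,\mathbf{m} + (\mathbf{V}\cdot\operatorname{grad})\,\mathbf{m}
   = (\mathbf{l}\cdot\operatorname{grad})\,\mathbf{m}, \\[4pt]
\nu\nabla^{2}\mathbf{v} + \lambda\nabla^{2}\mathbf{V}
  &= \tfrac12(\nu+\lambda)\nabla^{2}\mathbf{l} + \tfrac12(\nu-\lambda)\nabla^{2}\mathbf{m}, \\
\nu\nabla^{2}\mathbf{v} - \lambda\nabla^{2}\mathbf{V}
  &= \tfrac12(\nu+\lambda)\nabla^{2}\mathbf{m} + \tfrac12(\nu-\lambda)\nabla^{2}\mathbf{l},
\end{align*}
```

where the last two lines follow from v = ½(l + m) and V = ½(l − m). Collecting terms reproduces the two symmetric equations with the coefficients ν + λ and ν − λ.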

This form of the magnetohydrodynamic equations has been used by
Lundquist (1952) to suggest a condition under which equipartition between
kinetic and magnetic energies might be valid. If ℰ_s and ℰ_d are respectively
the sum and difference of the kinetic and magnetic energies :

$$\mathscr{E}_s = \tfrac{1}{4}\rho\,(l^{2} + m^{2}), \qquad \mathscr{E}_d = \tfrac{1}{2}\rho\,(\mathbf{l}\cdot\mathbf{m}).$$

Consequently ℰ_d is zero if l and m are perpendicular. This condition
does not uniquely specify v and V, so that equipartition between the
kinetic energy and total magnetic energy would appear to be valid under
a fairly wide range of conditions, controlled by H₀. In the general case
it would appear that (2.11) is valid if H₀ is small, and h large, or alternatively
if the special condition H₀ = −2h holds.

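If the disturbance fields themselves are required the equations can be
rearranged to yield them, although at the expense of complete symmetry.
Introduce the disturbance vectors l′ and m′ according to

$$\text{(a)}\quad \mathbf{l} = \mathbf{l}' + \mathbf{V}_0, \qquad \text{(b)}\quad \mathbf{m} = \mathbf{m}' - \mathbf{V}_0, \tag{2.18}$$

where

$$\mathbf{V}_0 = \sqrt{\frac{\mu}{4\pi\rho}}\;\mathbf{H}_0. \tag{2.19}$$

A physical interpretation of V₀ will be seen later (cf. § 4). l′ and m′ now
satisfy the equations :

$$
\left.
\begin{aligned}
\text{(a)}\quad & \frac{\partial\mathbf{l}'}{\partial t} + (\mathbf{m}'\cdot\operatorname{grad})\,\mathbf{l}' - (\mathbf{V}_0\cdot\operatorname{grad})\,\mathbf{l}' - \mathbf{F}_h + \operatorname{grad}\mathscr{P}_1 - \tfrac{1}{2}\alpha\nabla^{2}\mathbf{l}' - \tfrac{1}{2}\beta\nabla^{2}\mathbf{m}' \\
& \qquad = \Bigl\{(\mathbf{V}_0\cdot\operatorname{grad}) - (\mathbf{m}'\cdot\operatorname{grad}) + \tfrac{1}{2}\alpha\nabla^{2} - \tfrac{1}{2}\beta\nabla^{2} - \frac{\partial}{\partial t}\Bigr\}\mathbf{V}_0 \\
\text{(b)}\quad & \frac{\partial\mathbf{m}'}{\partial t} + (\mathbf{l}'\cdot\operatorname{grad})\,\mathbf{m}' + (\mathbf{V}_0\cdot\operatorname{grad})\,\mathbf{m}' - \mathbf{F}_h + \operatorname{grad}\mathscr{P}_1 - \tfrac{1}{2}\alpha\nabla^{2}\mathbf{m}' - \tfrac{1}{2}\beta\nabla^{2}\mathbf{l}' \\
& \qquad = \Bigl\{(\mathbf{V}_0\cdot\operatorname{grad}) + (\mathbf{l}'\cdot\operatorname{grad}) - \tfrac{1}{2}\alpha\nabla^{2} + \tfrac{1}{2}\beta\nabla^{2} + \frac{\partial}{\partial t}\Bigr\}\mathbf{V}_0
\end{aligned}
\right\} \tag{2.20}
$$

If the seed field H₀, and consequently also V₀, is independent of the
position and constant in time, the right-hand side of each equation
vanishes; in this case the resulting equations have a form very similar
to (2.16). There is not complete symmetry, however, the sign of the
third term in each equation being different.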


2.4. The Vorticity


In the absence of a magnetic seed field the fluid flow may be either
irrotational or rotational, the distinction being permanent in time. The
introduction of a magnetic field causes the irrotational motion to break
down. The equation of motion (2.6 a) can be arranged to include the
vorticity vector ξ (=curl v) :

$$\frac{D\boldsymbol{\xi}}{Dt} = \operatorname{curl}\mathbf{F}_h + \frac{\mu}{4\pi}\operatorname{curl}\left\{\frac{1}{\rho}\,[\operatorname{curl}\mathbf{H}\times\mathbf{H}]\right\} - \operatorname{curl}\left\{\frac{1}{\rho}\operatorname{grad}p\right\} + \nu\nabla^{2}\boldsymbol{\xi} + \tfrac{1}{3}\nu\operatorname{grad}\operatorname{div}\boldsymbol{\xi}.$$

Owing to the presence of the field it never happens that Dξ/Dt=0 even
for the special case of an incompressible, inviscid fluid acted on by a
conservative hydrodynamic force; the condition only holds in the limit
of vanishing magnetic field strength. Accordingly, irrotational flow
cannot be a permanent feature of magnetohydrodynamic flow. This
conclusion is particularly important in connection with the application of
perturbation methods.

Equation (2.6 b) for an incompressible fluid is :

$$\frac{D\mathbf{H}}{Dt} = (\mathbf{H}\cdot\operatorname{grad})\,\mathbf{v} + \lambda\nabla^{2}\mathbf{H}. \tag{2.21}$$

This equation is the exact analogue of the vorticity equation of conventional
hydrodynamics (Batchelor 1950), the latter equation resulting if ξ replaces
H and the kinematic viscosity ν replaces λ. For this reason λ is sometimes
called the magnetic viscosity. By analogy with the conventional case
the effect of the magnetic viscosity is to disperse any disturbance from
the form it would have in the absence of viscosity. This follows from (2.21)
for DH/Dt cannot in general be zero, any region initially devoid of a field
remaining so only if λ=0 (σ=∞). Accordingly if the electrical conductivity
is infinite the magnetic field is linked rigidly with the fluid; the magnetic
field is now said to be frozen into the fluid. In this case the flux linked
with a circuit in the fluid is constant.

The analogy between (2.21) and the vorticity equation of hydrodynamics 
means that the conventional vorticity theorems remain valid in magneto- 
hydrodynamics provided they are interpreted in a way appropriate to 
this new case (see, for example, Elsasser 1950 b, 1955, 1956). 


2.5. Boundary Conditions 

In applying the equations set down so far to physical systems it is 
necessary that the solutions should satisfy specific boundary conditions. 
In magnetohydrodynamics the boundary conditions are especially critical 
—particularly those associated with electromagnetic components. These 
conditions enter through the conduction or insulation properties of the 
boundary surfaces, so that a full specification of the physical system is 
indispensable. In particular, the solution applicable to one physical 
system can be applied to another only with the greatest caution. This 
applies also to systems of infinite volume, since any local irregularities, 
such as turbulence, will introduce specific boundary restrictions. 

The boundary conditions for a problem follow from the requirements
of conservation of mass and charge, together with artificial constraints
such as the vanishing of normal velocity component, or the maintenance
of a steady temperature.


§ 3. THE MAGNETIC FORCE


Before passing to solutions of the eqns. (2.6) it is of interest to consider 
the conditions under which the magnetic force is to be expected to 
predominate over the non-magnetic forces. 

It follows from the electromagnetic equations that the electric field
energy is to be equated to v²/c² times the magnetic field energy. The
same relation links the displacement current and the total current. If
v ≪ c the electric field energy can be neglected in comparison with the
energy of the magnetic field. This is valid in the non-relativistic case,
and is the condition usually involved in magnetohydrodynamics. It
then follows from (2.2 c) and (2.2 e) that the magnetic force, F_M, has the
order of magnitude

$$F_M \sim \sigma v H_0^{2}. \tag{3.1}$$


The inertia force, F_I, has the order of magnitude F_I ~ ρv²/L, where L is
a characteristic length. Consequently

$$R_I \equiv \frac{F_I}{F_M} \sim \frac{\rho v^{2}}{L}\,\frac{1}{\sigma v H_0^{2}} = \frac{\rho v}{L\sigma H_0^{2}}. \tag{3.2}$$

The Coriolis force, F_C, has order of magnitude F_C ~ ρvΩ, where Ω is the
angular velocity of rotation. Therefore :

$$R_C \equiv \frac{F_C}{F_M} \sim \frac{\rho v\Omega}{\sigma v H_0^{2}} = \frac{\rho\Omega}{\sigma H_0^{2}}. \tag{3.3}$$

The pressure force, F_p, has the magnitude F_p ~ ρV_s²/L, where V_s is the
local velocity of sound, so that

$$R_p \equiv \frac{F_p}{F_M} \sim \frac{\rho V_s^{2}}{L\sigma v H_0^{2}}. \tag{3.4}$$

Finally, the viscous force, F_V, has the order of magnitude F_V ~ ρνv/L²
where ν is the kinematic viscosity. Then :

$$R_V \equiv \frac{F_V}{F_M} \sim \frac{\rho\nu v}{L^{2}}\,\frac{1}{\sigma v H_0^{2}} = \frac{\rho\nu}{L^{2}\sigma H_0^{2}}. \tag{3.5}$$

The non-dimensional numbers R_I, R_C, R_p and R_V are characteristic numbers
playing a rôle analogous to that of the Reynolds number in conventional
hydrodynamics. In order that the magnetic force should control the 
motion these numbers must be very small, ideally zero. Study of 
eqns. (3.2) to (3.5) shows that for a general rotating conducting fluid 
immersed in a magnetic field the magnetic force will predominate over 
the others if :— 
(i) The density is sufficiently small. 
(ii) The kinematic viscosity is sufficiently small. 
(iii) The magnetic seed field is sufficiently large. 
(iv) The electrical conductivity is sufficiently large—ideally infinite. 
(v) The characteristic length is sufficiently large. 
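
The five conditions listed above can be explored numerically by forming the ratio of each non-magnetic force to the magnetic force. The sketch below takes the order-of-magnitude estimates of this section at face value (magnetic force σvH₀², inertia ρv²/L, Coriolis ρvΩ, pressure ρV_s²/L, viscous ρνv/L²); the function and argument names are hypothetical and any consistent system of units may be used.

```python
def magnetic_force(sigma: float, v: float, h0: float) -> float:
    """Order-of-magnitude magnetic force, sigma * v * H0^2."""
    return sigma * v * h0 ** 2


def force_ratios(rho, v, length, sigma, h0, omega, nu, v_sound):
    """Ratios of the non-magnetic forces to the magnetic force.

    Small values of every ratio indicate that the magnetic force controls
    the motion.
    """
    f_m = magnetic_force(sigma, v, h0)
    if f_m <= 0:
        raise ValueError("magnetic force estimate must be positive")
    return {
        "inertia": (rho * v ** 2 / length) / f_m,
        "coriolis": (rho * v * omega) / f_m,
        "pressure": (rho * v_sound ** 2 / length) / f_m,
        "viscous": (rho * nu * v / length ** 2) / f_m,
    }


if __name__ == "__main__":
    # Purely illustrative inputs: doubling the seed field lowers every ratio fourfold.
    base = dict(rho=2.0, v=1.0, length=1.0, sigma=1.0, omega=1.0, nu=1.0, v_sound=1.0)
    print(force_ratios(h0=2.0, **base))
    print(force_ratios(h0=4.0, **base))
```

Every ratio falls as the density or viscosity is lowered and as the seed field or conductivity is raised. A larger characteristic length lowers the inertia, pressure and viscous ratios, while the Coriolis ratio in this estimate does not depend on the length.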


The strongest interaction between the magnetic field and the conducting
fluid is clearly found in an infinite non-rotating inviscid fluid of small
density and of infinite electrical conductivity. This conclusion has been
summarized in a single relation (Lundquist 1952, Bullard 1955). Such a
strong interaction between the field and the fluid will require first that
the field is ‘frozen’ into the fluid so gripping it firmly (which requires σ
to be large), and second that the magnetic and dynamic forces are of
comparable strength (giving a relation between ρ and H₀). These two
requirements are expressed analytically respectively as :


v>(oL)-1 and v=Hyp-™. of es 
Consequently the condition for strong interaction may be written : 
H,Lo 


Se eC Ac 


For mercury, p=13-6 g/cm’ and o=5 x 10-* e.m.u., 
Hl he ee erermeree re | 


If H,~104 gauss, L>7cm. For complete coupling L must, therefore, 
be of the order of 107, and since the field must be uniform the laboratory 
reproduction of such coupling presents extreme problems (cf. §7). For 
a real ionized gas the density is now reduced (at N.T.P.) by a factor $10^4$,
but the conductivity is also reduced by a factor 10-10%, so that the 
condition (3.8) again applies with essentially the same numerical factor. 
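
The mercury estimate can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the coupling condition is taken as H0 L sigma / sqrt(rho) >> 1, and the conductivity of mercury is taken as 5e-5 e.m.u., a value assumed here because it reproduces the quoted length of about 7 cm at 10^4 gauss. The function names are hypothetical.

```python
import math


def coupling_number(H0, L, sigma, rho):
    """Dimensionless group H0 * L * sigma / sqrt(rho); strong coupling needs >> 1."""
    return H0 * L * sigma / math.sqrt(rho)


def marginal_length(H0, sigma, rho):
    """Length at which the coupling number equals one; L must greatly exceed it."""
    return math.sqrt(rho) / (sigma * H0)


# Mercury: rho in g/cm^3, sigma in e.m.u. (assumed value), H0 in gauss.
L_mercury = marginal_length(H0=1e4, sigma=5e-5, rho=13.6)
```

Because the condition is a strong inequality, a laboratory apparatus would need a length many times `L_mercury`, over which the field must also remain uniform.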


§ 4. MAGNETOHYDRODYNAMIC WAVES 


The theory so far developed has involved the manipulation of well- 
established equations, the modern contribution being the adoption of 
the magnetohydrodynamic approximation. The significant feature of the 
modern work is the initial recognition by Alfvén (1942) that solutions of 
the magnetohydrodynamic equations exist which have a wave character 
even if the conducting fluid is incompressible. This suggests the possibility 
of propagating energy in such fluids at a velocity in excess of the mean 
fluid velocity. This is an important conclusion since in the absence of 
an external magnetic field, the situation treated by conventional hydro- 
dynamics, such a possibility does not exist. The presence of the field has 
allowed a new mode of energy propagation. The waves in the body of 
an incompressible conducting fluid immersed in a magnetic field are now 
usually called Alfvén waves, and are a special case of magnetohydro- 
dynamic waves. They are found to be transverse, and are believed at 
the present time to be of importance in the understanding of certain 
astro- and geophysical phenomena. 

In the present section the solutions of the eqns. (2.6) having a wave
character, the magnetohydrodynamic waves, are examined for both
incompressible and compressible fluids. In the latter case it will be found
that the conventional mode of wave propagation (sound waves) is both
affected by a magnetic field and supplemented by other modes.


4.1. Incompressible Media—Alfvén Waves 


In his initial work Alfvén (1942) made the following eight assumptions:
(i) the fluid is incompressible; (ii) the fluid is inviscid; (iii) the electrical
conductivity is infinite; (iv) the fluid velocity is small; (v) the disturbance
field h is small; (vi) the seed field $H_0$ is constant in space and time;
(vii) the magnetohydrodynamic approximation is valid; and (viii) the
fluid volume is infinite (no boundary conditions). The equations of the
motion are consequently (2.7) with $\nu = \lambda = 0$; further, external hydrodynamic
forces are assumed zero, $F_H = 0$. Differentiating the equations
with respect to the time and rearranging gives:

$$\text{(a)}\quad \frac{\partial^2 v_i}{\partial t^2} = \frac{\mu H_0^2}{4\pi\rho}\,\frac{\partial^2 v_i}{\partial x_j^2}, \qquad \text{(b)}\quad \frac{\partial^2 h_i}{\partial t^2} = \frac{\mu H_0^2}{4\pi\rho}\,\frac{\partial^2 h_i}{\partial x_j^2}. \qquad (4.1)$$

These are the equations describing a transverse wave travelling with
velocity $V_0$ given by

$$V_0 = H_0 \left(\frac{\mu}{4\pi\rho}\right)^{1/2}. \qquad (4.2)$$

This is identical with the previous eqn. (2.19) and provides its physical 
meaning. The direction of propagation of the Alfvén waves can be parallel 
or antiparallel to the externally applied field. The waves propagate 
without dispersion. 

The velocity with which the wave disturbance propagates is seen to
increase with the seed field, and to increase with decreasing density. For
$\rho = 1$ g/cm³, $\mu = 1$, and $H_0 = 1000$ gauss, $V_0$ is roughly $3 \times 10^2$ cm/sec. This
is a characteristic magnitude, being very much less than the velocity of
light. As the fluid density becomes very small the magnetohydrodynamic
velocity becomes increasingly large, though the possibility of a singularity
must be excluded. As the density decreases the displacement current
becomes increasingly important, ultimately taking control over the
conduction current. Because the magnetohydrodynamic approximation
amounts to the neglect of the displacement current it becomes invalid
as the density decreases below a certain (somewhat arbitrary) limit; the
eqns. (4.1) then cease to be sufficient. It has been shown by Rydberg
(1948) that as the density decreases the magnetohydrodynamic disturbance 
becomes more electromagnetic in character, ultimately becoming a pure 
electromagnetic wave in the limit of vanishing density. 
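
The quoted magnitude is easily verified. The sketch below assumes the Alfvén velocity in the form V0 = H0 (mu / 4 pi rho)^(1/2) in Gaussian units; the function name is an illustrative choice.

```python
import math


def alfven_velocity(H0, rho, mu=1.0):
    """Alfven velocity in cm/sec for H0 in gauss and rho in g/cm^3."""
    return H0 * math.sqrt(mu / (4.0 * math.pi * rho))


# Values quoted in the text: rho = 1 g/cm^3, mu = 1, H0 = 1000 gauss.
V0 = alfven_velocity(H0=1000.0, rho=1.0)  # roughly 2.8e2 cm/sec
```

The velocity scales linearly with the field and as the inverse square root of the density, so a fourfold reduction in density doubles it.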

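It is readily shown that (2.5 b) may be written in the form

$$\nabla^2 \mathbf{H} = \left(\frac{1}{c^2} + \frac{4\pi\rho}{\mu H_0^2}\right)\frac{\partial^2 \mathbf{H}}{\partial t^2},$$

which represents a wave travelling with velocity

$$V = \left(\frac{1}{c^2} + \frac{4\pi\rho}{\mu H_0^2}\right)^{-1/2}, \qquad (4.3)$$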


P.M, SUPPL,—OCTOBER 1956 2K 
which clearly includes both the case when the displacement current is
important ($V \to c$), and the case when it is not ($V \to V_0$). The neglect of
the displacement current does not allow this to be apparent from (4.1)
alone. According to (4.3) a magnetohydrodynamic wave can be regarded
as an electromagnetic wave travelling through a medium of dielectric
constant $K = 1 + 4\pi\rho c^2/\mu H_0^2$.
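
The two limits can be illustrated numerically. The sketch below assumes the velocity in the form V = (1/c^2 + 4 pi rho / (mu H0^2))^(-1/2) and the equivalent dielectric constant K = 1 + 4 pi rho c^2 / (mu H0^2), so that V = c / sqrt(K). The function names and the sample densities are illustrative assumptions.

```python
import math

C_LIGHT = 2.998e10  # velocity of light in cm/sec


def mhd_wave_velocity(H0, rho, mu=1.0, c=C_LIGHT):
    """Wave velocity with the displacement current retained."""
    return (1.0 / c**2 + 4.0 * math.pi * rho / (mu * H0**2)) ** -0.5


def effective_dielectric_constant(H0, rho, mu=1.0, c=C_LIGHT):
    """Dielectric constant of the equivalent medium, so that V = c / sqrt(K)."""
    return 1.0 + 4.0 * math.pi * rho * c**2 / (mu * H0**2)


dense = mhd_wave_velocity(H0=1000.0, rho=1.0)       # close to the Alfven velocity
tenuous = mhd_wave_velocity(H0=1000.0, rho=1e-30)   # close to the velocity of light
```

For the dense case the term 1/c^2 is negligible and the Alfvén velocity is recovered; for the tenuous case the density term is negligible and the disturbance travels at essentially the velocity of light.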

The general solutions of (4.1) are two waves travelling in opposite 
directions along the field direction fi : 

v 

h 
where $\omega$ is the angular frequency and $\hat{n}$ is the propagation direction.
For a plane wave the functions f and g are sinusoidal. With v and h
determined from (4.1), the associated electric field follows from eqns. (2.2),
and the effect of the disturbance on the initial pressure follows from (2.13).
Walén (1949) pointed out that the solutions just derived also apply if
the restriction to small amplitudes is relaxed, provided that the relation
(2.11) is valid. The solutions now, however, are particular solutions of
the equations. It appears that the present case of an incompressible
fluid is the only one for which the disturbance field h can be of the same
order as, or be greater than, the seed field $H_0$.

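If the conditions of infinite conductivity, or of inviscid flow, are relaxed
the position is extremely complex. If only one restriction is relaxed,
however, the disturbance can be treated as a damped harmonic wave.
The presence of both dissipative coefficients, however, makes it impossible
to treat the disturbance as a simple wave. This is seen as follows from
(2.12), with $F_H = 0$. Rearrangement gives the two equations:

$$\text{(a)}\quad \frac{\partial^2 v_i}{\partial t^2} = \frac{\mu H_0^2}{4\pi\rho}\,\frac{\partial^2 v_i}{\partial x_j^2} + \nu\,\frac{\partial^3 v_i}{\partial t\,\partial x_j^2} + \frac{\lambda\mu H_0}{4\pi\rho}\,\frac{\partial^3 h_i}{\partial x_j^3}$$

$$\text{(b)}\quad \frac{\partial^2 h_i}{\partial t^2} = \frac{\mu H_0^2}{4\pi\rho}\,\frac{\partial^2 h_i}{\partial x_j^2} + \lambda\,\frac{\partial^3 h_i}{\partial t\,\partial x_j^2} + \nu H_0\,\frac{\partial^3 v_i}{\partial x_j^3} \qquad (4.5)$$

If $\nu = \lambda = 0$ these equations reduce to (4.1). If the fluid is inviscid ($\nu = 0$),
(4.5 b) becomes:

$$\frac{\partial^2 h_i}{\partial t^2} = \frac{\mu H_0^2}{4\pi\rho}\,\frac{\partial^2 h_i}{\partial x_j^2} + \lambda\,\frac{\partial^3 h_i}{\partial t\,\partial x_j^2}$$

to be coupled with (4.1 a). Choosing

$$\mathbf{h} = \mathbf{h}_0 \exp\{\alpha x_j + i\omega t\}, \qquad \mathbf{v} = \mathbf{v}_0 \exp\{\alpha x_j + i\omega t\}$$

it follows that the exponent $\alpha$ is

$$\alpha = \pm \frac{i\omega}{V_0}\left(1 + \frac{i\omega\lambda}{V_0^2}\right)^{-1/2}. \qquad (4.6)$$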


For strong damping, therefore, the attenuated disturbance field cannot be
regarded as having damped sinusoidal form; in consequence it cannot be
assigned a true wavelength. In the low frequency range, however, or
if the damping is small ($\lambda\omega \ll V_0^2$),

$$\alpha \simeq \pm \frac{i\omega}{V_0}\left(1 - \frac{i\omega\lambda}{2V_0^2}\right),$$

and, writing

$$z_0 = \frac{2V_0^3}{\lambda\omega^2}, \qquad (4.7)$$

v and h become, with the z-axis chosen to be along the field:

$$\text{(a)}\quad \mathbf{v} = \mathbf{v}_0 \exp\{-z/z_0\} \exp\{i\omega(t - z/V_0)\}, \qquad \text{(b)}\quad \mathbf{h} = \mathbf{h}_0 \exp\{-z/z_0\} \exp\{i\omega(t - z/V_0)\}, \qquad (4.8)$$

with $\mathbf{v}_0$ and $\mathbf{h}_0$ related according to:

$$\mathbf{v}_0 = -\frac{V_0}{H_0}\left(1 - \frac{i\omega\lambda}{2V_0^2}\right)\mathbf{h}_0. \qquad (4.9)$$

When $\lambda \to 0$, (4.9) is in agreement with (2.11). The same results apply if
$\lambda = 0$ but $\nu \neq 0$, representing a viscous fluid of infinite electrical conductivity,
$\nu$ now replacing $\lambda$ in the formulae. Again it is possible to represent the
disturbance as an attenuated harmonic disturbance. In each case the
wavelength is equal to $2\pi V_0 \omega^{-1}$. If, on the other hand, both $\lambda$ and $\nu$ are
simultaneously non-zero and finite, representing a general fluid, it is not
possible to assign the disturbance any simple damped harmonic form.
It appears from this work that a true Alfvén wave can be propagated 
only if the dissipative effects are at least small, and if the seed field is not 
indefinitely large, thus making the magnetohydrodynamic approximation 
insufficient. There is also the further restriction that the frequency must 
not be too high. The attenuation of Alfvén waves in an infinite fluid of 
finite electrical conductivity has been considered by Roberts (1955 a) 
who obtained both periodic and aperiodic solutions of (2.6). A feature 
of this work is the employment of the Heaviside operational calculus 
which allows the aperiodic disturbances to be treated without approxi- 
mation. The reader is referred to the original paper for details of the 
work. Hide (1955) has treated in detail the passage of magneto- 
hydrodynamic waves through a medium of varying density. 
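
The passage from the exact exponent to the weakly damped wave can be checked numerically. The sketch below assumes an inviscid fluid of finite conductivity, for which the exponent is alpha = (i omega / V0)(1 + i omega lambda / V0^2)^(-1/2), with `lam` standing for the coefficient written lambda (magnetic diffusivity). The function names and the sample values are illustrative assumptions.

```python
def alpha_exact(omega, V0, lam):
    """Exponent of exp(alpha*x + i*omega*t), '+' branch, inviscid fluid."""
    return (1j * omega / V0) * (1 + 1j * omega * lam / V0**2) ** -0.5


def alpha_weak_damping(omega, V0, lam):
    """First-order expansion, valid when lam * omega << V0**2."""
    return (1j * omega / V0) * (1 - 1j * omega * lam / (2 * V0**2))


def attenuation_length(omega, V0, lam):
    """Distance over which the amplitude falls by a factor e (weak damping)."""
    return 2 * V0**3 / (lam * omega**2)


# Hypothetical values with lam * omega / V0**2 of order 5e-3.
omega, V0, lam = 10.0, 300.0, 50.0
exact = alpha_exact(omega, V0, lam)
weak = alpha_weak_damping(omega, V0, lam)
z0 = attenuation_length(omega, V0, lam)
```

The imaginary part of `weak` equals omega/V0, which gives the wavelength 2 pi V0 / omega, and its real part equals 1/z0. When lam * omega is not small compared with V0**2 the two exponents part company and no true wavelength can be assigned.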


4.1.1. Superposition of the Waves 


The equations of motion are non-linear so that it is not possible to apply 
generally a method of superposition to magnetohydrodynamic waves even 
though the amplitudes involved be small. Parker (1955) has pointed out, 
however, that a superposition principle can be applied in the limit of 
perturbations small in comparison with the magnetic energy of the seed 
field. 

In the linear approximation of the eqns. (2.7), for instance when the
equipartition expression (2.11) applies, it is a valid procedure to superpose
amplitudes, but even here the pressure field cannot be simply superimposed
(Lundquist 1952). This follows from (2.17 b). For

$$p_M = p + \frac{\mu H^2}{8\pi} = \text{constant} = p_0 \ \text{(say)}.$$

Suppose there are two solutions of the wave equation written $(\mathbf{v}_1, \mathbf{h}_1, p_1)$
and $(\mathbf{v}_2, \mathbf{h}_2, p_2)$; then

$$p_{M1} = p_1 + \frac{\mu H_1^2}{8\pi} = p_0, \qquad p_{M2} = p_2 + \frac{\mu H_2^2}{8\pi} = p_0.$$

For the composite wave

$$p_{12} + \frac{\mu}{8\pi}(\mathbf{H}_1 + \mathbf{H}_2)^2 = p_0$$

so that

$$p_{12} = p_1 + p_2 - p_0 - (\mu/4\pi)(\mathbf{H}_1 \cdot \mathbf{H}_2). \qquad (4.10)$$

This shows that the composite pressure is smaller than the sum of the
'separate' pressures.
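
Relation (4.10) can be confirmed with arbitrary numbers. The sketch below assumes that the total pressure p + mu H^2 / 8 pi is held at a constant value p0 for each wave separately and for the composite wave; the helper names and the sample field vectors are hypothetical.

```python
import math


def dot(a, b):
    """Scalar product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def wave_pressure(H, p0, mu=1.0):
    """Fluid pressure that keeps p + mu*H^2/(8*pi) equal to the constant p0."""
    return p0 - mu * dot(H, H) / (8.0 * math.pi)


def composite_pressure(H1, H2, p0, mu=1.0):
    """Fluid pressure when the two fields are superposed."""
    H12 = [a + b for a, b in zip(H1, H2)]
    return wave_pressure(H12, p0, mu)


# Hypothetical field vectors (gauss) and constant total pressure.
p0, H1, H2 = 100.0, (3.0, 0.0, 4.0), (1.0, 2.0, 2.0)
p1, p2 = wave_pressure(H1, p0), wave_pressure(H2, p0)
p12 = composite_pressure(H1, H2, p0)
rhs_410 = p1 + p2 - p0 - dot(H1, H2) / (4.0 * math.pi)  # (4.10) with mu = 1
```

The cross term in the scalar product of the two fields is what prevents the pressures from being added directly, even though the amplitudes themselves superpose.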


4.1.2. Reflection and Refraction 


The reflection and refraction of magnetohydrodynamic waves at a 
boundary surface between two conducting media can be calculated by 
applying essentially the same procedure as is applied in the corresponding 
theory of electromagnetic radiation. The problem was first treated, 
rather briefly, by Walén (1944, 1946) and Alfvén (1950) in connection 
with their theory of sunspots, and later by Lundquist (1952). The problem 
has recently been treated much more fully by Ferraro (1954) and Roberts 
(1955 b) who considered the effect of a plane boundary between two 
semi-infinite inviscid incompressible fluids of infinite electrical conductivity. 
Whereas Ferraro considered waves polarized at right angles to the plane 
of incidence, Roberts in his paper did not restrict the incident wave to 
any special polarization. 

Suppose the subscripts i, r, re to refer respectively to the incident,
reflected, and refracted wave; and further suppose the incident and
reflected waves to move in medium 1 with density $\rho_1$, with velocity $V_1$,
the refracted wave, with velocity $V_2$, moving in a medium of density $\rho_2$.
If $\mathbf{u}_i$ is the normal vector to the wave fronts, the plane wave is


h=A,;exp {iw (‘- = ")t 


$\mathbf{A}_i$ and $\mathbf{u}_i$ are perpendicular since $(\mathbf{A}_i \cdot \mathbf{u}_i) = 0$. If in a Cartesian coordinate
frame the (xy) plane lies in the surface of separation (z=0) and if the
(y-z) plane is parallel to the field, the disturbance fields have the form


4 ; Vy 


(b) h,=A,, exp {ie (1 =) 
2 


(4.11) 


where $l$ and $m$ are related to the angle $\beta$ between the field and the z-axis
according to $l = \cos\beta$ and $m = \sin\beta$. Also div h = 0 for each of the
disturbance fields. The
boundary conditions across the surface of separation are the continuity
of electric and magnetic fields, and of the normal resolute of the velocity,
and of the pressure. These conditions of continuity lead at once to the
relations between the wave amplitudes.


aa Vp 2—V Pi = 9 

A, TET ee and A,,.= har rr ee ee AS 

Ferraro in his paper inferred that only those waves can be reflected or
refracted for which the associated magnetic field and particle velocity
are parallel to the plane of separation. Roberts, however, has shown
the incorrectness of this conclusion. Consequently whereas Ferraro
restricted his arguments to waves polarized at right angles to the plane
of incidence, Roberts imposed no such restriction. One interesting result
of the theory is that the boundary can be expected to be undisturbed by
the reflection-refraction process.

The relations between the incident, reflection and refraction angles also
follow from the boundary conditions. It emerges from the theory that
the incident, reflected and refracted disturbances all lie in the same plane.
With the (xy) plane in the surface of separation, we let the angle between
the incident plane and that containing z and $H_0$ be $\psi$. If angles are
measured from the plane of separation then Ferraro and Roberts both
find

$$\text{(a)}\quad \tan\theta_i + \tan\theta_r = -2\cot\beta\cos\psi$$

$$\text{(b)}\quad \tan\theta_{re} = \frac{V_1}{V_2}\tan\theta_i + \left(\frac{V_1}{V_2} - 1\right)\cot\beta\cos\psi \qquad (4.13)$$

as respectively the laws of reflection and refraction in magnetohydrodynamics.
It is seen that the orientation of the field relative to the
incident plane is important. If the magnetic field is normal to the
separation surface ($\beta = \tfrac{1}{2}\pi$), or the plane of incidence is perpendicular to
the (y-z) plane ($\psi = \tfrac{1}{2}\pi$), eqns. (4.13) take on the simple form

$$\text{(a)}\quad \tan\theta_r = -\tan\theta_i, \qquad \text{(b)}\quad \tan\theta_{re} = (V_1/V_2)\tan\theta_i. \qquad (4.14)$$

While the equality of incident and reflected angles holds now, as in optics,
the tangent law replaces the optical sine law.
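
The laws of reflection and refraction lend themselves to a short numerical illustration. The sketch below assumes the forms tan(theta_i) + tan(theta_r) = -2 cot(beta) cos(psi) and tan(theta_re) = (V1/V2) tan(theta_i) + (V1/V2 - 1) cot(beta) cos(psi), with all angles in radians and measured from the plane of separation. The ratio V1/V2 in the second relation is an assumed reading of the refraction law, and the function names are hypothetical.

```python
import math


def _cot(x):
    """Cotangent of x (radians)."""
    return math.cos(x) / math.sin(x)


def reflected_angle(theta_i, beta, psi):
    """Reflected angle from the assumed law of reflection."""
    return math.atan(-math.tan(theta_i) - 2.0 * _cot(beta) * math.cos(psi))


def refracted_angle(theta_i, beta, psi, V1, V2):
    """Refracted angle from the assumed law of refraction."""
    ratio = V1 / V2
    return math.atan(ratio * math.tan(theta_i)
                     + (ratio - 1.0) * _cot(beta) * math.cos(psi))


# Field normal to the boundary (beta = pi/2): the simple tangent law applies.
theta_i = 0.3
theta_r = reflected_angle(theta_i, beta=math.pi / 2, psi=0.7)
theta_re = refracted_angle(theta_i, beta=math.pi / 2, psi=0.7, V1=2.0, V2=1.0)
```

With the field normal to the boundary the reflected angle is the negative of the incident angle and the tangents of the refracted and incident angles stand in the ratio of the two velocities. When the two velocities are equal the refracted angle equals the incident angle for any orientation of the field.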


4.2. Compressible Media 


According to conventional hydrodynamics, energy can be transported 
in a compressible conducting medium without an electromagnetic field 
through the agency of longitudinal acoustic waves, but transverse waves 
are not possible. The introduction of an external magnetic field is found 
to introduce a magneto-acoustic coupling ; furthermore the fluid is now 
able to support transverse wave fields. These longitudinal and transverse 
disturbances have been considered by Herlofson (1950) and independently 
in more detail by van de Hulst (1951). The necessary equations have
also been derived by Lundquist (1952). We turn our attention briefly 
now to this work, the reader being referred to the excellent paper of 
van de Hulst for full details. 

The displacement of a fluid element during the time interval $(t_1 - t_0)$,
$\mathbf{r}$, is related to the velocity according to

$$\mathbf{r} = \int_{t_0}^{t_1} \mathbf{v}\,dt. \qquad (4.15)$$


Further the pressure is related to r 
p=Gidivr, «te. 0 ie eee ee 


where Y is the compressibility. We confine the discussion to the case of
small amplitudes only, but include also the displacement current and
the effect of finite electrical conductivity. The insertion of (4.15) and
(4.16) into the full eqn. (2.5 a), using also eqns. (2.2), leads to an equation
for $\mathbf{r}$:
= _ LEX Hol + 3 {[v x Hy] x Ho} — ¥ grad div r 


Ae fe 0 
+ prerad div (3) + pv# ($F) ek aes 


Suppose the z-axis of a Cartesian frame is chosen to be along the direction 
of wave propagation, the seed field lying in the (ay) plane. Suppose the 
seed field, Hy, to make an angle 9 with the propagation direction. In order 
to simplify the treatment consider the case of plane waves, so that we 
seek solutions of (4.17) proportional to the factor 


$$\exp\{i(\omega t - kz)\}, \qquad (4.18)$$

where $\omega$ is real, and $k$ is the wave number. In this way the problem is
reduced to that of determining the possible values of $k$ in terms of $\omega$ and
the dissipative coefficients $\nu$ and $\lambda$, thus giving the number of possible
vibrational modes.

Van de Hulst has listed five distinct modes of vibration for each arrange- 
ment of the field. If the external field is zero, k satisfies one of three 
expressions : 


POR 


3g" 
2 1/2 
(b) b= + | ide | a dette a aiEO) 


bea (2) 


These expressions apply respectively to a damped sound wave, a damped
electromagnetic wave (with $\mu = 1$), and a hydrodynamic wave controlled
by viscosity; van de Hulst has termed this a viscosity wave. This last
mode has zero wavelength in an inviscid fluid. Two further modes are
possible, and are identical with (4.19 b and c), apart from an inessential
difference of the plane of polarization.


@) ee (eee 
Suppose next that the external field is along the propagation direction.
In this case $H_{0z} \neq 0$ and the transverse component of the seed field
vanishes. It seems that now the pure acoustic wave
is unaffected by the field; this is to be expected since the (longitudinal)
vibrational direction is in the direction of the field, making a
material-magnetic field coupling physically most unlikely. The remaining four
modes are coupled, and follow from the fourth order equation:


pc2v Gi wpv fl. Tose 4ir pv 
k4 sae cn Saeed ae p 2 
oH ,"w v mae cH) ai Ga ma) 


ey twpc* dirpc*| ¢ 
{x (14 oP) Fe = Oe) ae oS (4:20) 


The four modes represent damped transverse waves. If all the damping
terms are zero ($\nu = \lambda = 0$), (4.20) describes a wave travelling either parallel
or antiparallel to the field, with a velocity


oo K 4irp —1/2 
: ze & as =) 


and so becomes an electromagnetic wave in the limit of vanishing density,
and an Alfvén wave in the limit $H_0 \to 0$. With the dissipative terms
present it is not possible to present the solutions simply in uncoupled
form.

Some coupling still remains if the magnetic field is made transverse to
the propagation direction ($H_{0z} = 0$, with a non-zero transverse component).
In this case there is an unaffected viscosity wave and an unaffected
electromagnetic wave. The second
viscosity wave is slightly affected only; the two remaining modes are
coupled, satisfying the equation


Wet fink po?) 6 Hat (8 
we Sto) G9 SOmUn t/4ro aslo rae G Nw 


In particular the acoustic wave is affected in this case, being coupled with 
a longitudinal wave having a velocity formally the same as an Alfvén 
wave. 

If all the damping terms are zero ($\nu = 0$, $\lambda = 0$) the equations of motion
lead to possible values of the wave number which allow simple
interpretation. The equations for $k$ are:
2 


ae Kil Hoy! 
Oye Ca men tara 


02 02 
‘ (Hip ,4ap 1 = 5 Fee cs (1 ae) . (4.21) 
© S(nse tHe a)~ ane (a 
Three modes correspond to an electromagnetic wave, a sound wave and 
a true Alfvén wave. Further, if the electrical conductivity is zero, 
ordinary undamped electromagnetic and sound waves result. 

When the dissipative terms are absent, van de Hulst gives relations
between the appropriate energy densities. His results show that:
(i) For an electromagnetic wave there is an equipartition between magnetic
and electrical energies. (ii) For an acoustic wave there is an equipartition
between the kinetic and potential energies. These are both well known
results. (iii) For an Alfvén wave there is an equipartition between the
magnetic and kinetic energies. This result confirms (2.11) and suggests
that the condition of equipartition in an ideal conducting fluid may be
regarded with confidence. Two further modes remain, the exact form
of which depends upon the field strength. If the field is sufficiently small
so that $V_0 \ll V_s$ ($V_s$ being the sound velocity) the mode corresponds to
a normal (transverse) Alfvén wave. As $V_0$ increases, the mode changes
to essentially a retarded sound wave, that is a wave travelling more slowly
than normal sound, but having an equipartition between the kinetic and
potential energies; the electromagnetic energies are very small. It can be
regarded as a longitudinal wave with respect to the magnetic field direction.
As the magnetic field increases still further so that $V_0 \gg V_s$ the magnetic
energy takes the place of the potential energy, the wave becoming
transverse: but this statement again is made with respect to the magnetic
field, and not to the propagation direction.

It is seen then that the interaction between a magnetic field and fluid 
flow can introduce some highly complex situations. The most character- 
istic feature of the coupling is the existence of transverse wave systems 
in the fluid, the magnetic field enabling the fluid to become capable of 
withstanding shear. In this respect the magnetohydrodynamic case 
differs from the associated hydrodynamical case. The presence together
of both longitudinal and transverse modes of vibration in the case of a
compressible fluid makes it natural to treat the conventional hydrodynamic
theory as a special case of the magnetohydrodynamic theory. This
position is hard to substantiate in practice, however, since so far there 
do not seem to exist any general theorems in magnetohydrodynamics 
that pass to the useful and well-known counterparts in the limit of 
vanishing field. 


§ 5. MAGNETOTHERMAL EFFECTS 


5.1. The Basic Equations 


The conduction of heat in a moving fluid depends upon the fluid velocity ; 
if the fluid is also an electrical conductor the heat flow will in addition be 
disturbed by the presence of an external magnetic field. In order to 
include this effect into the theory, the magnetohydrodynamic eqns. (2.6) 
must be supplemented by the equations of heat conduction. If $T$ is the
local temperature, and $\kappa$ the coefficient of heat conduction (assumed now
constant in time, and uniform in space), and if the fluid is rotating about 
an axis @ with angular velocity Q, the full equations of motion are: 


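$$\text{(a)}\quad \rho\frac{D\mathbf{v}}{Dt} = \rho\mathbf{F}_H + \frac{\mu}{4\pi}[\operatorname{curl}\mathbf{H} \times \mathbf{H}] - \operatorname{grad} p + 2\rho[\mathbf{v} \times \mathbf{\Omega}] + [[\mathbf{\Omega} \times \mathbf{r}] \times \mathbf{\Omega}] + \rho\nu\nabla^2\mathbf{v} + \tfrac{1}{3}\rho\nu\operatorname{grad}\operatorname{div}\mathbf{v}$$

$$\text{(b)}\quad \frac{D\mathbf{H}}{Dt} = \lambda\nabla^2\mathbf{H} + (\mathbf{H} \cdot \operatorname{grad})\mathbf{v} \qquad (5.1)$$

$$\text{(c)}\quad \frac{DT}{Dt} = \kappa\nabla^2 T$$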
The distribution of heat sources must be further specified; and if the
fluid is compressible the relation between pressure, density and temperature
must also be given.

The local temperature is related to a datum temperature $T_0$ (for example,
applying at the origin of coordinates) according to

$$T = T_0 + \Delta T \qquad (5.2)$$

provided the distance is not too great. $\Delta T$ has two contributions, viz.:
a contribution controlled by the gradient of temperature; and also a
magnetohydrodynamic contribution of fluctuating form. The simplest
situation is a linear temperature gradient $|\beta|$ in the direction of the unit
vector $\hat{s}$; if $\theta$ describes the fluctuations then

$$\Delta T = -\beta(\hat{s} \cdot \mathbf{r}) + \theta. \qquad (5.3)$$

$\mathbf{r}$ is here the vector distance from the region of temperature $T_0$ to the
point with temperature $T$. The insertion of (5.3) and (5.2) into (5.1 c)
provides the equation determining the unknown function $\theta$. If the
hydrodynamic body force acts in the direction $\hat{k}$, so that $\mathbf{F}_H = \hat{k}F_H$, and
the seed field acts in the direction 7, so that Hy=AH,7, then, if | H, | is 
constant in both space and time, the equations of motion (5.1) become : 


D A . 
(a) Pa =pkF y+ i. {(Hy . grad)h-+ (bh . grad)h} 


| 
—grad {p+ &(H : H)} + 2plvx2 | 


+[[Q x r] x Q]+ prV2v-+ p4v grad div v i (5.4) 
(b) = =AV*h-+ (H, . grad)v+ (h. grad)v 
(c) - —f(v.grad)(§. r)—B(8.v)+«V?0—xBV?(é .r) 
with the auxiliary conditions 
divh=0 and div v=— a : he Pe the (DeD,) 


ot 


These equations apply quite generally to a viscous compressible fluid in 
rotation and immersed in an externally maintained constant magnetic 
field. They form a very complex set of equations, and actual progress in 
their solution has been possible only if simplifying assumptions are invoked. 

The two obvious restrictions are those of incompressibility, and of only
small deviations from equilibrium: the first restriction gives div v = 0,
and the second restriction allows squares and products of the variables
v, h, and $\theta$ to be neglected. The equations of motion (5.4) become


(a) p = =—phl a i (H, . grad)h—grad 5+ 2p[v x Q]-+ prvV2v 
| , (5.6) 


00 2 
(b) a =)\V?h-+ (H, . grad)v (c) ry =Bf(s . v)-+«V20 
with


E=p+ —(H.H). a SY 


Although the fluid is incompressible, the density variation with temperature
must be considered. If $\rho_0$ is the density when $T = T_0$ and $\alpha$ is the coefficient
of thermometric expansion

$$\rho = \rho_0(1 - \alpha\Delta T). \qquad (5.8)$$

$\Delta T$ ($= T - T_0$) is given by some expression such as (5.3). If $\Delta T$ is sufficiently
small, $\rho$ differs from $\rho_0$ by a term of the first order only; consequently when
allowing for this thermal effect in (5.6 a) it is necessary to distinguish
between $\rho$ and $\rho_0$ only in the term including the body force. This is usually
called the Rayleigh approximation in ordinary hydrodynamics. To this
approximation (5.6 a) becomes
Ov 
Se 
with 


kF_y—cAThFP y+ ey (H,.grad)h—grad 3,-+2[v x Q]+vV2v. (5.9) 
0 


teeine be 
2- (n+ 2H .H)). 


The eqns. (5.6) and (5.9) are to be solved according to the appropriate
boundary conditions; since these will involve the normal components of
velocity, and the value of $\theta$ at the boundary, the conditions to be imposed
will generally depend upon whether the surface is free or rigid.

In applying the equations to specific systems a succession of $\nabla^2$ operations
is used to resolve them to equations containing v, h, and $\theta$ alone.


5.2. Thermal Instability 


Of especial interest from meteorological and astrophysical viewpoints are 
the conditions affecting the onset of fluid instability in the presence of a 
heat gradient. Problems for plane horizontal fluid layers without magnetic 
fields have been the subject of careful experimental and theoretical work, 
starting with the initial experiments of Bénard (1900) when the rectangular 
and hexagonal convection cells were recognized. Rayleigh (1916) first 
considered the theoretical aspects of the work, and the problem was also 
treated by Jeffreys (1926); the problem was essentially solved later by 
Pellew and Southwell (1940) using a variational procedure. 

For finding the conditions when instability sets in, i.e. the condition of 
marginal stability, the principle of exchange of stabilities is invoked 
according to which any time dependence has an exponential form with 
essentially real exponents. If the exponents are complex, over-stability 
can arise under certain conditions, any small disturbances now growing 
continuously in amplitude with time. To obtain the equations governing 
marginal stability all disturbance vectors are taken to be small, and the 
time variations are equated to zero. The procedure is to determine the 
lowest value of the Rayleigh number (defined by (5.21)), and so the heat 
gradient, for a given form of cell, the lowest of all the possible values of 
such Rayleigh numbers determining both the critical temperature gradient
and the form of cell most likely to arise. The treatment for rectangular 
cells was considered by Jeffreys (1926) and that for hexagonal cells by 
Christopherson (1940) ; consequently Pellew and Southwell (1940) were 
able to treat both circumstances. 

Chandrasekhar has applied and extended the Pellew and Southwell
treatment of the classical Rayleigh-Jeffreys problem to the case of a
conducting fluid immersed in a magnetic field. In a series of papers he
has considered the case of a plane horizontal fluid layer with an internal
heat gradient immersed in a magnetic field, and also has taken account of
the possibility of rotation. In order to obtain the equations of marginal
stability Chandrasekhar invoked again the principle of exchange of
stabilities and considered the conditions necessary for its validity. He
found that the conditions were controlled by the relative values of the
kinematic viscosity, $\nu$, and the thermal conduction, $\kappa$. The exact form
depends on the problem in hand, but the type of restriction is $\nu < \kappa$ and
this may be both a necessary and sufficient condition (Chandrasekhar
1952 a). If it holds, convective instability is possible, whereas if it does
not, over-stability will arise; these conditions seem to refer respectively
to geophysics and astrophysics.


5.2.1. The Equations 
Consider a horizontal plane fluid layer of thickness $l$, with a gravitational
force acting vertically downwards. The fluid layer is heated from below,
the temperature gradient being vertical ($\hat{s} = \hat{k}$). The equations for marginal
stability are then obtained from (5.6) and (5.7):
(a) grad E=yko LpV2v-+ 2.Q[v x a]+ an (7. grad)h 


0 


(b) 7V2h=H,(7.grad)v _ (6.10) 
(c) xV20=—Bf(v.k) 
(d) divv=0 ; divh=0 


where according to convention $\gamma = g\alpha$. Following Rayleigh it is convenient
to use as variables the normal component of the velocity and of the 
magnetic field disturbance 


) DOUG wae (ie > Wye ees oth tacts ae BLL) 
The equations for w and h, derived by curl operations on (5.10), are 

A [0H » A 
(a) —vV?2l=20(4.grad)w+ rie (+. grad) 


(b) —nV2h=H,(4. grad)w | 
(c) —xV20=—pw | (5.12) 
(d) —7V?6=H,(7 .grad)é 

(e) W4w=y[(k. grad)?—V2]0+2.Q(4.grad)¢— pig (7. grad)V2h 


47 po 
with
; = ; : : oo A . k 
(0) ero eae ox efile Unsieeieed Onan 
(6) €=curlvy : H=curlA 
These equations, given first by Chandrasekhar (1954), are the equations 


to be used in the treatment of onset of convection. They are to be solved 
subject to the boundary conditions for the fluid 


w=0=0, z2=0or l. i, ate ee ee eee 
On a rigid surface 
a =0 and {=0,- oe. oo 
or alternatively on a free surface 
02w ac 
—— —= a A Se ee 
02? ee dz ae ( ) 
and the boundary conditions for the field 
E, =H ,=0, or “By=0,. 5 Aa Oe 


(5.15) applying to a perfectly conducting surface, and (5.16) to a 
free surface facing a vacuum. From (5.12) equations for ) and w can 
be formed ; in order to illustrate the complexity of the analysis we write 
down the equation for h, which is typical of the whole set : 


2H 2 2 402 
v2 { (vs * 5, (+. grad)*) + te (a. grad) h 
0 Vv 


2 2 a 
=— e (vs bes (7. grad) ((é : grad)*—V") h 


Pov 


. (5.17) 


5.2.2. Solution Procedure 


In order to solve the equations it is necessary to specify clearly the 
type of convection that first sets in, viz. whether cellular or hexagonal. 
This is achieved by a suitable separation of the variables. For example, 
if k, d, and 7 are identical, so that g, Hy and Q act in the same direction: 
the functions w, 0, £, h and ¢ are separated according to 


w(x, y, 2)=f(v,y)W(z), O(a, y, z) =f (w, y)O(2), 
C(x, y, )=f(x, y)Z(2), h(x, y, 2)=f(w, y)X (2), - (5.18) 
P(x, Y, 2)=f(w, y)P(2) 
where, according to Pellew and Southwell (1940), for cellular convection 
f satisfies the equation 


nie eS xs 
satga)f=—of  . . ... . (6.19) 


Here $a^2$ is a real number characterizing the geometry of the cell. Physically,
the separation amounts to an analysis of the initial disturbance into
normal modes, each characterized by the wave number $a/l$ (see e.g. Rayleigh
1916). The insertion of (5.17) into (5.12) gives equations for the unknown
functions W, 0, Z, X and @: 


a? d 22_.dZ 1 
(a) le —hat): —18t) Al W— Foal ee ex = ita@ 
d* oe 22..da (a? 
0) (Eero) on] 288 (80) w 
A l* dw 
pata oo el Bg gt aa ae 
© (B 8) X de 
a Al? dZ 
@) (Fs — |? o*) @o— SARS Ie cae (5.20) 
d? d? 2 d2 72 
aaa f (ia) 
dz (a2 
+2 5a — la? )} W 
2 2 2 | 
=—n0| (% 2!) oe 2] | 
where 
og 4O2p 2H Mole 
ESE ES Saye Oa . (5.21) 


$R$, $T$ and $Q$ are the characteristic dimensionless numbers for the theory
defining respectively the critical temperature gradient, angular velocity,
or magnetic field for instability†. The values are likely, in a typical case,
to be respectively of the order 10?, 10%, 10°. 

The boundary conditions (5.14) to (5.16) become : 


2 
(a) be —Tat) —PQ | W— ses ole = =0}e—0 or | 
d r f (5.22) 
J 


(bd) W=0 
together with 


dW 
aa ae a ae I 
on a rigid surface, or 
ew AZ 
ap pO oe a ERY) 


on a free surface, and 
CD) SEA), C0) Daa Oana er SOLD) 
respectively in place of (5.16). The number of boundary conditions is 
exactly equal to the order of the equations (ten in each case) so that 
the problem is uniquely soluble. 
The procedure in principle is first to assign for a given conducting liquid
the value of a² (the type of convection), Q (the strength of the seed field H_0)
and T (the conditions of rotation); the eqns. (5.20) are then used (not


† It need hardly be pointed out that T appearing in (5.20), and defined in
(5.21), is quite distinct from the local temperature although the same symbol
has been used in each case.
actually solved necessarily) to determine the possible values of the Rayleigh
number for the problem. The procedure is to be repeated for different
values of the parameter a² and in this way a table is to be formed giving
the dependence of the adverse heat gradient on the convection pattern.
The lowest possible value of the Rayleigh number as a function of a² is
taken to be the critical Rayleigh number R_c defining conditions of marginal
stability, and gives the adverse temperature gradient at which convection
first sets in for given Q and T. The calculation must then be repeated
for a range of the characteristic numbers Q and T.
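The scan over a² described above can be sketched numerically for the simplest case. The closed form used here, R(a) = (π² + a²)³/a² for two free boundaries with neither field nor rotation, is the classical result of Rayleigh (1916) and is not written out in this section; it is taken as an assumption, with lengths measured in units of the layer depth. The function names are illustrative only.

```python
import math

def rayleigh_free_free(a):
    """Marginal Rayleigh number for wave number a.

    Assumed closed form for two free boundaries, no field, no rotation
    (Rayleigh 1916); not taken from the equations of this section.
    """
    return (math.pi**2 + a**2) ** 3 / a**2

def critical_rayleigh(curve, a_lo=0.1, a_hi=20.0, steps=20000):
    """Scan the wave number uniformly and return (R_c, a_c) at the lowest point."""
    best_R, best_a = float("inf"), None
    for i in range(steps + 1):
        a = a_lo + (a_hi - a_lo) * i / steps
        R = curve(a)
        if R < best_R:
            best_R, best_a = R, a
    return best_R, best_a

R_c, a_c = critical_rayleigh(rayleigh_free_free)
print(f"R_c = {R_c:.2f} at a = {a_c:.4f}")
```

Under this assumption the scan should recover the analytic minimum R_c = 27π⁴/4 at a² = π²/2. In the general problem the curve R(a) is not available in closed form, which is why the variational procedure described next is used instead.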

The impracticability of actually carrying through this procedure led
Chandrasekhar to employ the equivalent and very elegant variational 
procedure developed by Pellew and Southwell (1940) for solving the 
corresponding problem without the field. The method is to arrange the 
equations governing stability so that the product aR is equated to a 
quotient of integrals involving the other variables, such as W, for given 
values of the parameters 7’ and Q. By employing a series of trial functions 
for W, the condition is imposed that R must have a stationary (minimum) 
value. It is found that the method is insensitive to the exact form of the 
trial function unless excessive accuracy (greater than, say, 1 part in 10⁴)
is required. The minimum value of R obtained in this way is the critical
Rayleigh number, R_c. Pellew and Southwell in their paper point out
that in the no-field problem the variational procedure leads to slightly
high values of R_c, and this would appear to remain true in the more general
case, although the error is likely to be no greater than 10⁻². Chandrasekhar
developed his calculations in a number of papers and we consider this 
work now. 


5.2.3. Magnetic Field Alone 


A non-rotating fluid layer in the presence of a magnetic field was the
first problem treated by Chandrasekhar (1952 a), cf. also Thompson (1951).
The equations for the onset of convection are obtained from (5.12) by
equating Ω to zero. The boundary conditions do not now contain ζ, so
that ζ and consequently φ need not now appear in the equations. The
magnetic field to be inserted is the component of the seed field along the
direction of gravity; consequently there is no loss of generality in
restricting the analysis to the case where g and H_0 are parallel. The
equations for the problem are now (5.20 a and e) with T equated to zero
(Ω=0), along with the boundary conditions (5.22) to (5.24). The
variational procedure already mentioned is then applied. 

Three distinct physical systems are to be recognized according to whether 
the boundary surfaces are both free, both rigid, or mixed; the latter 
conditions would appear the most appropriate in applications to atmospheres 
of the earth and the sun. Chandrasekhar gives solutions to the problem in
numerical form. It appears that the magnetic field inhibits convection,
the effect increasing in importance as the field strength rises. This
conclusion applies for each of the three possible boundary conditions,
being most marked for the free-free case; it seems, however, that the
type of boundary condition used does not greatly affect the general results
of the calculations. In the limit of large magnetic field strength it appears
that R_c is proportional to Q. Chandrasekhar’s results agree with those
of Pellew and Southwell in the limit of vanishing field.

In order to see the magnitude of the inhibiting effect of the field we 
quote some values given by Chandrasekhar (1952 a). If R_0 is the critical
Rayleigh number without the field, then for mercury at room temperature
(o=1-1x 10° ohm/em ; py=1-7 10-?) and with magnetic field strengths 
of 1.3×10³ gauss and 4×10³ gauss (the values of the non-dimensional
parameter Q being respectively 10³ and 10⁴), the values of the ratio R_c/R_0
are respectively 23 and 182 for the free-free boundary conditions. For
the rigid-rigid and rigid-free boundary conditions, the corresponding
values are 5.1 (Q=10³) and 318 (Q=10⁴); and 10 (Q=10³) and 72.9 (Q=10⁴).
The inhibiting effect is seen to be quite marked even for field strengths of 
moderate value, so that laboratory tests, especially for the rigid—free 
boundary conditions, should be quite feasible. Very recently Nakagawa 
(1955) has in fact confirmed the predictions experimentally in a quantitative 
manner. 
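The free-free ratios quoted above can be checked with a short calculation. The sketch assumes the closed-form marginal curve for two free boundaries with a vertical field, R(a) = (π² + a²)[(π² + a²)² + π²Q]/a², which is the standard free-boundary result but is not written out in this section; lengths are in units of the layer depth and the function names are illustrative.

```python
import math

def rayleigh_with_field(a, Q):
    """Marginal Rayleigh number for wave number a and field parameter Q.

    Assumed closed form for two free boundaries and a vertical field;
    it reduces to (pi^2 + a^2)^3 / a^2 when Q = 0.
    """
    s = math.pi**2 + a**2
    return s * (s**2 + math.pi**2 * Q) / a**2

def critical_rayleigh(Q, a_lo=0.1, a_hi=30.0, steps=30000):
    """Lowest Rayleigh number over a uniform scan of the wave number a."""
    return min(
        rayleigh_with_field(a_lo + (a_hi - a_lo) * i / steps, Q)
        for i in range(steps + 1)
    )

R_0 = critical_rayleigh(0.0)  # no-field reference value
ratios = {Q: critical_rayleigh(Q) / R_0 for Q in (1.0e3, 1.0e4)}
for Q, ratio in ratios.items():
    print(f"Q = {Q:.0e}: R_c/R_0 = {ratio:.1f}")
```

On this assumption the two ratios come out close to the values 23 and 182 quoted for the free-free case. The rigid-rigid and rigid-free cases have no comparably simple closed form and are not covered by the sketch.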


5.2.4. Rotation Alone 


In another paper Chandrasekhar (1953 c) treated the case of the fluid 
layer rotating in the absence of a field. Although this problem is not 
strictly magneto-thermal it is included here as being an integral part of 
the complete chain of argument. The effect of the Coriolis force on thermal 
stability is derived from eqns. (5.12) with H_0 (i.e. Q) equated to zero.
No generality is lost by treating the case of the rotation axis parallel to
gravity, the component of the angular velocity vector in this direction
being important in the theory. The problem now is to determine R_c in
terms of a² and T. Chandrasekhar obtained this information in numerical
form using the variational method for the three types of boundary
conditions. The results show that rotation has an inhibiting effect on
thermal instability, though the effect is rather less marked than that
of a magnetic field. Taking Chandrasekhar’s values, if R_0 is the Rayleigh
number without rotation and without the field, the ratio R_c/R_0 is a function of T;
for the free-free conditions R_c/R_0 has the values 1.257, 2.549, 8.178 for T
respectively 10², 10³, 10⁴; for the rigid-rigid boundary conditions the
corresponding values of the ratio R_c/R_0 are 1.028, 1.260, 2.759; and for
rigid-free conditions the values are 1.006, 1.062, 1.488. The effect of the
boundary conditions becomes more marked as the rotation velocity is
increased.
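The free-free values for rotation can be checked in the same way. The sketch assumes the closed-form marginal curve for stationary convection between two free boundaries, R(a) = [(π² + a²)³ + π²T]/a², which is not written out in this section; it ignores over-stability, and the function names are illustrative.

```python
import math

def rayleigh_with_rotation(a, T):
    """Marginal Rayleigh number for wave number a and rotation parameter T.

    Assumed closed form for stationary convection between two free
    boundaries; over-stable modes are not considered.
    """
    s = math.pi**2 + a**2
    return (s**3 + math.pi**2 * T) / a**2

def critical_rayleigh(T, a_lo=0.1, a_hi=30.0, steps=30000):
    """Lowest Rayleigh number over a uniform scan of the wave number a."""
    return min(
        rayleigh_with_rotation(a_lo + (a_hi - a_lo) * i / steps, T)
        for i in range(steps + 1)
    )

R_0 = critical_rayleigh(0.0)  # non-rotating reference value
ratios = {T: critical_rayleigh(T) / R_0 for T in (1.0e2, 1.0e3, 1.0e4)}
for T, ratio in ratios.items():
    print(f"T = {T:.0e}: R_c/R_0 = {ratio:.3f}")
```

The three ratios should agree closely with the free-free figures 1.257, 2.549 and 8.178 quoted above, which also confirms that rotation at T = 10⁴ inhibits convection far less than a field at Q = 10⁴.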

Chandrasekhar finds that as T → ∞ (the limit of indefinitely fast rotation
or vanishing kinematic viscosity), R_c³ becomes proportional to T², giving
the relation for the critical temperature gradient αgβ_c = constant × (Ω⁴κ³/νl⁴)^(1/3).
In the alternative limit of vanishing rotation the expression for the critical
temperature gradient approaches the form gαβ_c = constant × (κν/l⁴). It is
important to notice that the dependence of β_c on ν and on l is different
in these two limits, and this fact might well be important in applications
of the theory. In such applications the type of instability that first arises
will depend upon the ratio κ/ν. Chandrasekhar finds in his paper that,
if both bounding surfaces are free, convective instability arises if κ/ν < 1.478.
If this condition does not apply, over-stability (i.e. a disturbance of
increasing amplitude) can arise first provided the ratio Ω/ν is sufficiently
large. Finally it should be mentioned that Chandrasekhar demonstrates
the importance of the effect of rotation on the behaviour of the atmospheres
of the earth and of the sun for regions of height in excess respectively of
10 metres and 10⁴ metres.
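The two limiting forms of the critical gradient follow from the definitions of the dimensionless numbers. The short derivation below assumes the usual definitions R = gαβl⁴/κν and T = 4Ω²l⁴/ν² (the printed form of (5.21) is not legible here), with c and c_0 standing for unspecified numerical constants.

```latex
% Assumed definitions: R = g\alpha\beta l^4/(\kappa\nu), \quad T = 4\Omega^2 l^4/\nu^2.
% Fast rotation: R_c = c\,T^{2/3}
\frac{g\alpha\beta_c\, l^4}{\kappa\nu}
  = c \left(\frac{4\Omega^2 l^4}{\nu^2}\right)^{2/3}
\;\Longrightarrow\;
g\alpha\beta_c
  = 4^{2/3} c \,\kappa\,\Omega^{4/3}\,\nu^{-1/3}\, l^{-4/3}
  = 4^{2/3} c \left(\frac{\Omega^4 \kappa^3}{\nu\, l^4}\right)^{1/3}.

% Vanishing rotation: R_c = c_0
\frac{g\alpha\beta_c\, l^4}{\kappa\nu} = c_0
\;\Longrightarrow\;
g\alpha\beta_c = c_0\, \frac{\kappa\nu}{l^4}.
```

In the first limit β_c varies as ν^(-1/3) l^(-4/3), in the second as ν l^(-4), which is the difference in dependence remarked on in the text.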

At present there does not appear to be available any experimental test 
of the rotation effect. Presumably the rigid—rigid boundary conditions 
would be technically the most convenient, but here the effect becomes 
suitably large only for T in excess of 10⁴.


5.2.5. Magnetic Field and Rotation 

If the fluid layer is subjected to both a magnetic field and a Coriolis 
force the calculations become incredibly complicated, the full eqns. (5.20) 
to (5.25) now having to be used. Chandrasekhar (1954) has begun work 
on this problem, although so far only a preliminary survey can be claimed. 
Chandrasekhar restricted his discussion to the case where g, H_0 and Ω are
coplanar; this does now represent a restriction on the theory, since it is
no longer the case that only the components of H_0 and Ω in the direction
of g are important. The results obtained by applying the variational
procedure already show some completely unexpected features. It appears,
for example, that now two distinct forms of convective cell can be present
simultaneously, one being more elongated than the other, provided the
variables T and Q have suitable values. The interesting case of g, H_0
and Ω non-coplanar has not yet been treated. In fact it is clear that
this general case is extremely complex and that even the extensive 
numerical work reported by Chandrasekhar in his paper leaves the 
problem far from its final solution. 

Of interest from a geophysical and an astrophysical point of view is 
the behaviour of a fluid sphere or a series of concentric shells of various 
properties, with a chosen distribution of internal heat sources, rotating 
in the presence of a magnetic field, say a fixed dipole field with 
its origin within the fluid. (Care must be taken in choosing the field 
inside the sphere.) The importance of this problem has already been 
recognized by Chandrasekhar (1952 b, 1953 b, 1953 d) who has begun 
a treatment of the problem. This as yet has been restricted to a 
non-rotating heated sphere, or series of concentric shells, without a 
magnetic field, and so does not include magnetothermal effects. The 
work does, however, represent the first steps of the full theory, but is 
outside the scope of the present article. It is seen, then, that the theory 
of magnetothermal effects has provided important results ; Chandrasekhar 
has made this part of magnetohydrodynamics perhaps the most fully 
treated at the present time.


§ 6. STEADY CONDITIONS


Conditions of steady state, either equilibrium under magnetic and 
hydrodynamic forces, or alternatively steady flow, are of importance 
for their inherent simplicity. In the present section we consider these 
cases. 


6.1. Equilibrium : Magnetohydrostatics 


It is well known in conventional hydrodynamics that for a fluid to be 
in equilibrium in an external force field, the force must be derivable from 
a potential. The same is true in magnetohydrodynamics. For equilibrium 
therefore, (2.6 a) shows that, neglecting dissipative effects: 


Fz+ (u/47p)[curl H x H]=grad po fee (O-L) 


where 7% is a scalar function of position. If, further, F, is a conservative 
force with potential %,, and 4.=s+7, : 


(2/4 p)(curlL Hx A\=gradiyg; 9... . - . (6.2 


The eqns. (6.1) and (6.2) are necessary, though not usually sufficient, conditions
for equilibrium. Lundquist (1950) has considered eqn. (6.2) and distinguished
between force-free† and pressure-balanced fields, distinguished
respectively by the conditions grad .—0, or grad .=47 grad p40( p is 
here the hydrostatic pressure). 

For force-free fields a solution of (6.2) is 


$$
\operatorname{curl} \mathbf{H} = \alpha \mathbf{H}, \tag{6.3}
$$


where α can be either a scalar, or any space function such that (grad α . H)=0.
The physical interest in this solution is associated with the possibility of
arranging a force-free current distribution within a conductor so as to
cause a magnetic field outside the conductor. It would appear that such
a mechanism is not impossible, although the region cannot be simply
connected and α must be constant (cf., for example, Lüst and Schlüter
1954, Dungey 1953, Layzer, Krook and Menzel 1955).
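A concrete instance of condition (6.3) may help. The sketch below checks numerically that the field H = (sin αz, cos αz, 0), with α constant, satisfies curl H = αH, so that curl H × H vanishes. The field, the value of α and the helper names are illustrative choices, not taken from the papers cited, and the example fills all space rather than a bounded conductor.

```python
import math

ALPHA = 2.0  # hypothetical constant alpha in curl H = alpha H

def H(x, y, z):
    """Illustrative field H = (sin(alpha z), cos(alpha z), 0)."""
    return (math.sin(ALPHA * z), math.cos(ALPHA * z), 0.0)

def curl(F, x, y, z, d=1.0e-5):
    """Curl of the vector field F at (x, y, z) by central differences."""
    def partial(component, axis):
        plus, minus = [x, y, z], [x, y, z]
        plus[axis] += d
        minus[axis] -= d
        return (F(*plus)[component] - F(*minus)[component]) / (2.0 * d)
    return (
        partial(2, 1) - partial(1, 2),  # dHz/dy - dHy/dz
        partial(0, 2) - partial(2, 0),  # dHx/dz - dHz/dx
        partial(1, 0) - partial(0, 1),  # dHy/dx - dHx/dy
    )

def cross(a, b):
    """Vector product a x b."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

point = (0.3, -1.2, 0.7)
curl_H = curl(H, *point)
residual = max(abs(c - ALPHA * h) for c, h in zip(curl_H, H(*point)))
force = max(abs(f) for f in cross(curl_H, H(*point)))
print(f"max |curl H - alpha H| = {residual:.2e}")
print(f"max |curl H x H|       = {force:.2e}")
```

Both printed quantities should be zero to within the finite-difference error, showing that the magnetic force term in (6.2) vanishes for this field.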
Equation (6.2) can be transformed to become 


(H.. gradJH=grad fs, fy=yet(u/8mp\(HH). —. (6.4) 


For force-free fields (grad s,=0) there exists only the trivial solution 
H—constant. For the case grad 4,40, from (6.4) and (2.6@), for an 
incompressible fluid : 


1 
se = We gradyH—grad {°( p+ p+ e (H. H)) (6.5) 
4p p 7 
If H=0, (6.5) becomes the equilibrium condition of conventional hydrodynamics.
Equation (6.5) shows that the condition of equilibrium is in this case 
controlled not only by the field, but by the pressure. In terms of the 


† Note added in proof: Recently Chandrasekhar has considered the problem
of force-free magnetic fields further (1956, Proc. Nat. Acad. Sciences, 42, 1).
disturbance field h, using (2.9) and (6.5), the condition for equilibrium is,
for H_0 in the z-direction,


For small disturbances it follows that h is confined entirely to the (x, y) 
plane ; in consequence for equilibrium, electric currents in the fluid must 
be along the direction of the seed field. 


6.2. Steady Flow 


For steady flow the conditions dv/dt=0, dh/dt=0 apply. For an 
incompressible fluid the eqns. (2.6) become : 


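$$
\text{(a)}\quad (\mathbf{v}\cdot\operatorname{grad})\mathbf{v} = \mathbf{F}_H + \frac{\mu}{4\pi\rho}\,[\operatorname{curl}\mathbf{H}\times\mathbf{H}] - \frac{1}{\rho}\operatorname{grad} p + \nu\nabla^2\mathbf{v}
$$
$$
\text{(b)}\quad (\mathbf{v}\cdot\operatorname{grad})\mathbf{H} = (\mathbf{H}\cdot\operatorname{grad})\mathbf{v} + \lambda\nabla^2\mathbf{H} \tag{6.6}
$$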


Using (2.9), these equations can be expressed in terms of the disturbance 
vectors v and h. If H_0 is homogeneous, and if either the fluid motion is
small, or else if (2.11) is valid, (6.6) become, for conservative hydrodynamic 
force : 

(a) Tap (Ho . grad)h-+vV2v= grad {~ (» we) +41} 


877 (6.7) 


(b) (H_0 . grad)v + λ∇²h = 0


Using (2.13), the right-hand side of (6.7 a) is zero. The two eqns. (6.7) 
can then be arranged into two equations involving v and h separately. 
The equation for v is, if H_0 is in the z-direction:

$$
\frac{\partial^2 \mathbf{v}}{\partial z^2} = \frac{\nu\lambda}{V_0^2}\,\nabla^4 \mathbf{v}, \tag{6.8}
$$

where V_0 is given by (2.19). Equation (6.8) was first given by Lehnert
(1951), who used it in an experimental study of the velocity profiles of 
steady laminar flow between two plane surfaces. 
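The step from the pair (6.7) to a single equation for v can be sketched as follows. The derivation assumes that the right-hand side of (6.7 a) vanishes, that H_0 lies along z, and that the velocity defined in (2.19) has the form V_0² = μH_0²/4πρ; the last is an assumption, since (2.19) is not reproduced in this section.

```latex
% (6.7) with zero right-hand side and H_0 along z:
\frac{\mu H_0}{4\pi\rho}\,\frac{\partial \mathbf{h}}{\partial z} + \nu\nabla^2\mathbf{v} = 0,
\qquad
H_0\,\frac{\partial \mathbf{v}}{\partial z} + \lambda\nabla^2\mathbf{h} = 0.

% Apply \nabla^2 to the first equation and eliminate \nabla^2\mathbf{h} with the second:
\frac{\mu H_0}{4\pi\rho}\,\frac{\partial}{\partial z}
  \left(-\frac{H_0}{\lambda}\frac{\partial \mathbf{v}}{\partial z}\right)
  + \nu\nabla^4\mathbf{v} = 0
\;\Longrightarrow\;
\frac{\partial^2 \mathbf{v}}{\partial z^2}
  = \frac{\nu\lambda}{V_0^2}\,\nabla^4\mathbf{v},
\qquad V_0^2 = \frac{\mu H_0^2}{4\pi\rho}.
```

The same elimination carried out for h gives an equation of identical form, which is why the two disturbance vectors can be treated separately.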


Starting again from (6.7), V_0 and v are readily found to be related
according to

$$
(\mathbf{V}_0\cdot\operatorname{grad})\mathbf{v} = 0. \tag{6.9}
$$


(6.9) shows that there is no velocity component in the direction of the 
applied field ; the velocity lies entirely in the plane perpendicular to the 
seed field. An analogous effect is found in rotational hydrodynamic 
motion, where the motion is perpendicular to the rotation axis. This 
agrees with the general conclusion that the effect of a magnetic field is
in many ways similar to the effect of rotation, a result inferred experimentally
by Lehnert (1951).


§ 7. PRACTICAL DEMONSTRATION OF THE THEORY


The existence of transverse wave fields (magnetohydrodynamic waves) 
resulting from a coupling of hydrodynamic and electromagnetic phenomena 
was suggested by Alfvén on theoretical grounds. It is of interest to 
consider for a moment the experimental justification of the theory. The 
coupling is strong if the factor H_0Lσ/√ρ is large (Lundquist 1952, Bullard
1955), L here being a characteristic length. Observable consequences
occur only for strong magnetic fields (of the order 10⁴ gauss) acting over
fairly large volumes; in addition the electrical conductivity must be
high, and the density satisfactorily low. These conditions can be achieved
only very approximately in the laboratory and for this reason at the present
time there exists only a meagre, largely semiquantitative experimental
background to the theory. This unsatisfactory situation reacts unfavourably
on the further development of the theory. Certainly astrophysical
conditions can be expected to be intrinsically more favourable, but the 
effects of interest must now be expected to be masked by unrelated, or only 
loosely related, phenomena. 


7.1. Laboratory Work 


The first laboratory demonstration of a coupling between electromagnetic 
and hydrodynamic phenomena appears to be that reported by Hartmann 
in 1937. As a by-product of other work, Hartmann (1937) and Hartmann 
and Lazarus (1937) found that a transverse magnetic field of strength 
10⁴ gauss affects the flow of mercury in a pipe; further the onset of
turbulent flow was markedly inhibited. They explained the several 
results in terms of a flattening of the velocity field profile, thus reducing 
the effective fluid velocity. They also predicted on dimensional arguments 
a relation between the pressure gradient and field strength in mercury 
which appears to have experimental support. For initial laminar flow it 
was found that the field increases the pressure gradient; for initial 
turbulent flow the field at first decreases the pressure gradient up to a 
critical field strength, after which the effect becomes positive. These
results were later confirmed and extended by Lehnert (1951) who found 
in addition that the flow transition velocity is proportional to the magnetic 
field strength. More recently Murgatroyd (1953) has made measurements 
on the channel flow of mercury in the presence of a transverse homogeneous 
magnetic field. It would appear from these experiments that laminar flow 
can be maintained even for Reynolds numbers as high as 10⁵, if the magnetic
field is sufficiently large. A start was also reported in this paper in 
determining the factors affecting the transition from laminar to turbulent 
flow. These preliminary results appear to confirm the prediction of 
Lundquist (1952) that the transition occurs at a specific value of the 
ratio R/M (R is Reynolds number; M is the dimensionless quantity
μHl(σ/ρν)^(1/2) where l is a characteristic length), and for mercury this ratio
is of the order of 10³. Lehnert (1952) has considered this transition for
mercury contained between two non-conducting coaxial
cylinders, with a magnetic field of the order of 10⁴ gauss directed along
the axis. He found that the torque between the cylinders for a given 
relative angular velocity increases with the field strength. Although, as 
Elsasser (1955) has recently pointed out, it may not be easy to give a simple 
interpretation of these results, they do describe a general coupling of the 
type considered by the theory. Indeed from all the initial experimental 
work it is clear that a magnetic-hydrodynamic coupling does exist, and 
that, as suggested by the theory, the coupling has usually a very complex 
form. 
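To give a feeling for the size of the quantity M in such experiments, the following sketch evaluates it for mercury. It is written in SI units, where the same quantity takes the form M = Bl(σ/ρν)^(1/2) with B the flux density; the property values are approximate room-temperature figures supplied here for illustration, and the channel scale l = 1 cm and the ratio passed to the second function are hypothetical.

```python
import math

def hartmann_number(B, l, sigma, rho, nu):
    """M = B l sqrt(sigma / (rho nu)) in SI units.

    B in tesla, l in metres, sigma in S/m, rho in kg/m^3, nu in m^2/s.
    """
    return B * l * math.sqrt(sigma / (rho * nu))

def transition_reynolds(M, ratio):
    """Reynolds number at transition if it occurs at a fixed value of R/M."""
    return ratio * M

# Approximate room-temperature properties of mercury (illustrative values).
MERCURY = {"sigma": 1.0e6, "rho": 1.35e4, "nu": 1.1e-7}

# 10^4 gauss corresponds to 1 tesla; l = 1 cm is a hypothetical channel scale.
M = hartmann_number(B=1.0, l=0.01, **MERCURY)
print(f"M = {M:.0f}")
print(f"R at transition for R/M = 1000: {transition_reynolds(M, 1000.0):.2e}")
```

With these figures M is a few hundred, so a fixed transition value of R/M implies that the transition Reynolds number rises in direct proportion to the field strength, in line with Lehnert's finding quoted above.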

The great practical difficulties of obtaining a homogeneous constant
magnetic field of strength 10⁴ gauss, extending over a volume greater
than that enclosed by a two-inch cube, have made a laboratory recognition
and study of magnetohydrodynamic waves very difficult. Such a demonstration
was however first made by Lundquist (1949). In these experiments
a stainless steel cylindrical container was filled with mercury and was
immersed in a magnetic field of 10⁴ gauss, directed along the axis of the
vessel. A paddle arrangement at the bottom of the vessel allowed a 
disturbance of chosen period to be set up in the mercury ; it passed along 
the field lines of force to the surface of the mercury, its presence there 
being indicated by the reflection of a light beam by a floating mirror. 
By timing the disturbance from the bottom to the top of the container, 
the velocity of the disturbance was found ; by comparing the amplitudes 
of the disturbance at the top and bottom, the damping was determined. 
Lundquist in his paper made a prediction of the expected values. The 
experimental results, while not in exact agreement, were not incompatible 
with the calculations, the experimental accuracy not being high. 

Lehnert (1954) has improved on these techniques, and has repeated the 
experiment using liquid sodium. The smaller density of sodium compared 
with mercury, together with its favourable electrical conductivity, makes 
it more suitable for the purpose than mercury, a decision maintained in 
spite of the great technical difficulties associated with its use. Lehnert 
has obtained fairly satisfactory agreement between theory and experiment 
and it can be concluded that the existence of Alfvén waves has now been 
demonstrated in the laboratory. 
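The advantage of sodium over mercury can be illustrated by comparing the wave velocities at the same field strength. The sketch assumes the usual expression V = H/(4πρ)^(1/2) in Gaussian units with unit permeability; the densities are approximate values supplied here, not figures from the experiments described.

```python
import math

def alfven_speed(H_gauss, rho):
    """Wave velocity in cm/s for field H (gauss) and density rho (g/cm^3).

    Assumes V = H / sqrt(4 pi rho), i.e. Gaussian units with unit permeability.
    """
    return H_gauss / math.sqrt(4.0 * math.pi * rho)

RHO_MERCURY = 13.5  # g/cm^3, approximate
RHO_SODIUM = 0.93   # g/cm^3, approximate value for the liquid

v_mercury = alfven_speed(1.0e4, RHO_MERCURY)
v_sodium = alfven_speed(1.0e4, RHO_SODIUM)
print(f"mercury: {v_mercury:.0f} cm/s")
print(f"sodium:  {v_sodium:.0f} cm/s")
print(f"ratio:   {v_sodium / v_mercury:.2f}")
```

At 10⁴ gauss the velocity in sodium is several times that in mercury, since the ratio depends only on the square root of the density ratio; the higher conductivity of sodium, which reduces the damping, is a separate advantage not modelled here.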


7.2. Astrophysical Observations 

Astrophysical conditions are characterized by immense size as compared 
with the laboratory and with magnetic fields varying in magnitude from 
some 10⁻⁶ gauss in spiral arms (Chandrasekhar and Fermi 1953) or possibly
in the interplanetary regions (Beiser 1955), to some thousands of gauss 
in certain magnetic stars (cf., for example, Babcock and Burd 1952). 
Further, a very large amount of the physical material is ionized. Even 
when the magnetic fields are small the physical dimensions involved are 
so large that the coupling between the magnetic fields and hydrodynamic 
motion can be regarded as being the same as that associated with infinite
electrical conductivity. This introduces a simplification into the theory.
It is quite beyond the scope of the present article to consider the astrophysical
evidence for the existence of magnetohydrodynamic disturbances
in more than cursory fashion. The work must be mentioned, however, 
because it seems likely that the most complete demonstration of the 
theory will ultimately be found under astrophysical conditions. 

Alfvén first invoked magnetohydrodynamic waves in astrophysics in an 
attempt to understand sun-spot phenomena and the theory was later 
extended by Walén (1944). (For details of the theory see Alfvén 1950.) 
The seed field is taken to be a general solar magnetic field; the existence
of such a field, while not unlikely (Cowling 1945), is by no means proven
(von Klüber 1954, Babcock 1955). According to the theory, turbulent
disturbances in the central core, which it appears necessary to treat as
being contained in two independent regions, give rise to toroidal whirls.
These move out along the lines of magnetic force. Moving parallel to the 
field they pass to the surface producing a bipolar spot when suffering 
reflection at the (rigid) solar surface ; moving antiparallel to the field 
they excite the other turbulent core region to emit such whirls. By 
suitably positioning the two turbulent regions account can be made of 
the 11-year number density and 22-year magnetic sun-spot periods.
Further, by assuming the general solar field to be dipolar outside the core 
region a migration of spot activity towards the equator of the observed 
form can be obtained. The theory cannot, unfortunately, withstand 
detailed analysis in its present form. The most unsatisfactory feature 
of the theory is that it accounts for each property by invoking an explicit 
assumption ; in particular the existence of two central activity regions 
is of itself unconvincing, and the explicit dependence on a general magnetic 
dipole field lacks observational authority. The theory is also extremely 
restricted in the range of data that it treats. Although it is not to be 
expected that so complex a phenomenon as that of sun-spots can be treated 
by a simple theory it is to be required of any genuine theory that it uses 
a minimum of axioms. The present theory treats only bipolar spots 
and needs further drastic modifications in order to cover also unipolar 
and invisible spots, and sun-spot groups ; it does not give account of the 
Evershed effect, of the observed general spectroscopic data; while 
including a latitude effect it says nothing, without further assumptions, 
about the observed longitudinal asymmetry ; and, as Ferraro (1954) has 
pointed out, the rigid boundary condition invoked must very probably be 
replaced by a free surface condition in any final theory. Further, Cowling 
(1945) has drawn attention to the powerful destabilizing effects of a 
gravitational field on the whirl rings, and this objection is very real.
It must, in consequence, be admitted that although it might be expected
on theoretical grounds that magnetohydrodynamic waves might be
associated with sun-spots, there are as yet no definite data that can be
satisfactorily interpreted in such terms. 
Other exploratory work might well be more valuable from the present
point of view. Alfvén (cf. 1950, p. 151) has suggested that coronal heating
may be understood in terms of the ohmic attenuation of magnetohydrodynamic
waves. This approach may well be of value, particularly in
describing local heating, although it appears insufficient to supply the
extreme temperature (~10⁶ deg. K) observed. Again Dungey (1953) has
invoked magnetohydrostatics in a preliminary discussion of prominences. 
These attempts appear encouraging and suggest that ultimately many of 
the solar features will be understood in terms of a coupling between 
hydrodynamics and electromagnetism. Of course, the origin of the solar 
magnetic fields is still far from clear and this must form an important
break in the argument. Any full interpretation of astrophysical data in 
terms of magnetohydrodynamics must, therefore, involve some dynamo 
theory, but as yet no satisfactory dynamo theory has been advanced. 

The existence of magnetohydrodynamic waves has recently been inferred 
by Thompson (1955) from cosmic-ray physics. This author has suggested 
how difficulties concerned with the acceleration time of cosmic-ray particles 
in the theory of Fermi (1949) can be overcome if the accelerating mechanism 
involves magnetohydrodynamic waves. These are supposed to arise in 
ionized gas clouds moving in the general galactic magnetic field. Although 
not uniquely establishing the existence of magnetohydrodynamic waves, 
Thompson’s work does show that such waves have an empirical utility. 

In conclusion, then, it seems that even in the favourable conditions met 
with in astrophysics the definite existence of magnetohydrodynamic 
waves has not been undeniably demonstrated, although the importance 
of the coupling between electromagnetic and hydrodynamic fields is clear. 
Certainly the theory would benefit greatly from more definite experimental 
and observational results, and the difficulty met with in laboratory work 
makes it reasonable to conjecture that it is from observation that the best 
chance of help will come. One cause of the difficulty may be that in 
astrophysical problems the fluid velocity and amplitude are large, so that 
shock wave theory must be invoked. The remainder of this article is 
devoted to the treatment of magnetohydrodynamic shock waves. 


PART II. SHOCK DISTURBANCES

§ 8. FEATURES OF THE THEORY

Up to the present the arguments have involved only small velocity 
disturbances ; this restriction is now dropped, the arguments also applying 
to large velocity (i.e. shock) disturbances. This more general case is likely 
to have both astrophysical and technological importance. 

Shock disturbances in the absence of electromagnetic fields have been 
well explored both experimentally and theoretically in conventional 
hydrodynamics (see e.g. Courant and Friedrichs 1948). A considerable 
body of information is, therefore, available to act both as a general guide, 
and also as a limit, for the magnetohydrodynamic theory. The equations
of motion for the theory are the well-known conservation relations of
Rankine-Hugoniot or, for large velocities, the relativistic analogues (Taub
1948). The equations are non-linear and this form has important physical
consequences. In particular, it represents a coupling between microscopic 
and macroscopic properties, allowing a limited amount of macroscopic 
order to come about at the expense of a considerable amount of microscopic 
order. In physical terms this transfer of order is recognized macroscopically 
by sharp boundaries in the phenomena and by the associated high 
temperatures, respectively. The presence of a magnetic field is found 
to increase the efficiency of this coupling so that, for given conditions of 
flow, the temperature is rather lower when the field is present than when 
it is not. The effect, however, is not critical. An immediate consequence
of the high temperature is the continually changing background conditions 
for any extended shock disturbance. 

It is possible to satisfy the requirements of mass, momentum or energy 
conservation for conditions of high fluid velocity only if virtual
discontinuities in the medium are allowed. It is well known that, in the
absence of a magnetic field, purely hydrodynamical considerations allow 
for both expansion and compression shocks. The requirements of the 
second law of thermodynamics, however, enable only compression shocks 
to have physical stability. The same arguments can be applied to the 
magnetohydrodynamic case, so that only compression shocks need be 
considered. The increase of the velocity of sound due to the presence 
of a magnetic field (provided the field is not directed along the propagation 
direction, when there is no effect) causes the shock to build up less easily, 
so that for a given fluid velocity, the magnetohydrodynamic shock tends 
to be less strong than the corresponding hydrodynamic shock. 
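The thermodynamic argument that singles out compression shocks can be illustrated for the purely hydrodynamic case. The sketch uses the standard Rankine-Hugoniot jump relations for a normal shock in a perfect gas, expressed through the upstream Mach number M1; the perfect-gas assumption, the value γ = 1.4 and the function name are choices made here for illustration, and no magnetic field is included.

```python
import math

def shock_jump(M1, gamma=1.4):
    """Normal-shock jump for upstream Mach number M1 in a perfect gas.

    Returns (p2/p1, rho2/rho1, entropy change divided by c_v).
    M1 > 1 is a compression shock; M1 < 1 is the formal expansion shock.
    """
    p_ratio = 1.0 + 2.0 * gamma / (gamma + 1.0) * (M1**2 - 1.0)
    rho_ratio = (gamma + 1.0) * M1**2 / ((gamma - 1.0) * M1**2 + 2.0)
    ds_over_cv = math.log(p_ratio) - gamma * math.log(rho_ratio)
    return p_ratio, rho_ratio, ds_over_cv

for M1 in (0.8, 1.0, 2.0):
    p_ratio, rho_ratio, ds = shock_jump(M1)
    print(f"M1 = {M1}: p2/p1 = {p_ratio:.3f}, "
          f"rho2/rho1 = {rho_ratio:.3f}, ds/c_v = {ds:+.4f}")
```

The entropy change is positive for M1 > 1, zero at M1 = 1 and negative for M1 < 1, so that the conservation relations admit the expansion shock formally while the second law excludes it.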

In the usual treatments of shock propagation it is not found necessary 
to take account of the dissipative forces and this seems also true in the 
presence of a magnetic field, provided the electrical conductivity is not 
too low. Accordingly it is generally permissible to equate both ν and λ
to zero. This approximation does not apply if the shock profile or structure 
are of interest, since the dissipative terms (particularly viscosity and 
thermal conduction) are now controlling factors. Any full treatment of 
shock profiles and internal conditions must ultimately involve an application 
of the kinetic theory of non-equilibria, since it is known that, in the case of 
a gas, the width of a shock is of the order of a few mean free paths (Rayleigh
1910). The difficulties associated with such an application of the theory, 
particularly for dense matter, are particularly severe and only the first 
steps seem so far to have been taken (e.g. Mott-Smith 1951). Progress 
has, however, been made by applying the continuum theory, and although 
some objections can be raised to this procedure the conclusions to which 
it leads seem consistent with those resulting from the kinetic theory 
arguments. This agreement may reinforce the present inaccuracy of
both methods, but is more likely to support the view that the continuum
theory is sufficient to yield at least reliable orders of magnitude.

The conventional hydrodynamic theory leads to the formal recognition 
of both longitudinal and transverse shocks, where the material is moving 
respectively parallel and perpendicular to the propagation direction.
The theory leads to a finite, non-zero velocity for the longitudinal shock;
the transverse shock, however, must be assigned zero velocity and is
usually referred to as the slip stream. For a conducting fluid in the presence
of a magnetic field many of the conventional features are still present. 
Again both longitudinal and transverse shocks can be recognized, but now 
the distinction is more than a formal one. The transverse shock also has 
now a finite, non-zero velocity, and so has physical interest. In the small 
amplitude limit the transverse shock is recognized as an Alfvén wave. 

The theory of magnetohydrodynamic shocks is extremely complicated 
and progress so far has been made only for plane waves. The first systematic 
treatment, including relativistic effects, was made by Hoffmann and Teller 
(1950) and this work was later clarified in some respects by Helfer (1953). 
Recently the structure of magnetohydrodynamic shocks has formed the 
subject of some interesting and important work by Marshall (1955) using 
for simplicity a completely ionized gas. In the next two sections these 
several theories are considered in detail. 


§ 9. SHOCK PROPAGATION 


The propagation of magnetohydrodynamic shocks in an infinitely 
conducting fluid has been treated by Hoffmann and Teller (1950) and 
Helfer (1953). It is convenient to treat the longitudinal and transverse 
shocks separately. The method of procedure is to choose two coordinate 
frames of reference (inertial frames in the relativistic case) : one is chosen 
so that fluid material is at rest either ahead of or behind the shock, the shock 
moving with a velocity v ; and the other is chosen to move with the shock 
front, the shock now being steady. Denoting these frames respectively as 
the S- and S’-frame, relations for conservation and continuity are derived 
first in the S-frame and then transformed, by a Lorentz transformation, 
to the S’-frame. 


9.1. Longitudinal Shocks 


A longitudinal shock is one in which the material motion is parallel to 
the propagation direction. The magnetic field can have components 
parallel and perpendicular to this direction, leading to two types of 
longitudinal shocks. 


9.1.1. Parallel Field 

The magnetic field is now parallel to the propagation direction and so 
also to the fluid motion. A coupling between the hydrodynamic and 
magnetic fields cannot now exist, since the vector product of v and H 
is zero. Longitudinal parallel shocks are, therefore, unaffected by the 
presence of a magnetic field. The correctness of this simple argument is 
confirmed by the detailed analysis of Hoffmann and Teller. The motion 
of the shock is completely specified by the conventional Rankine-Hugoniot 
conservation equations, or their relativistic analogue, involving the 
relative density, or pressure, change across the shock. If the subscripts 1 
and 2 refer respectively to material ahead of, and behind, the shock the 
relativistic Rankine-Hugoniot equations are : 

$$
\begin{aligned}
\text{(a)}\quad & B_1^2(\epsilon_1 v_1^2 + p_1) = B_2^2(\epsilon_2 v_2^2 + p_2) \\
\text{(b)}\quad & B_1^2(\epsilon_1 c^2 + p_1)\,v_1 = B_2^2(\epsilon_2 c^2 + p_2)\,v_2 \\
\text{(c)}\quad & B_1 n_1 v_1 = B_2 n_2 v_2
\end{aligned}
\tag{9.1}
$$

where 

$$ B = (1 - v^2/c^2)^{-1/2} ; $$

$\epsilon$ is here the usual relativistic energy density, and $n$ the particle number 
density. In the non-relativistic limit ($\epsilon \to n m_0 = \rho$ ; $m_0$ being the rest 
mass) eqns. (9.1) become the conventional classical equations : 

$$
\begin{aligned}
\text{(a)}\quad & \rho_1 v_1^2 + p_1 = \rho_2 v_2^2 + p_2 \\
\text{(b)}\quad & \rho_1 U_1 v_1 + p_1 v_1 + \tfrac{1}{2}\rho_1 v_1^3 = \rho_2 U_2 v_2 + p_2 v_2 + \tfrac{1}{2}\rho_2 v_2^3 \\
\text{(c)}\quad & \rho_1 v_1 = \rho_2 v_2
\end{aligned}
\tag{9.2}
$$

$U$ is here the internal energy of the fluid per unit mass. The equations are 
determinate if either of the ratios $\rho_2/\rho_1$ or $p_2/p_1$ is given. The magnetic field strength 
does not occur in these equations, which therefore apply to both insulating 
and conducting fluids. 
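
As a minimal illustration of how eqns. (9.2) become determinate once one ratio is given, the following Python sketch takes the pressure ratio $p_2/p_1$ as input. It assumes an ideal gas with $U = p/\rho(\gamma - 1)$, the form used later in this section; the function name `classical_shock` and the numerical values are illustrative assumptions and are not taken from Hoffmann and Teller.

```python
import math


def classical_shock(rho1, p1, pressure_ratio, gamma=1.4):
    """Solve the classical Rankine-Hugoniot eqns. (9.2) for an ideal gas.

    Assumes U = p / (rho * (gamma - 1)).  Velocities are measured in the
    frame moving with the shock and are taken positive.
    Returns (rho2, p2, v1, v2).
    """
    P = pressure_ratio
    # Density ratio obtained by eliminating v1 and v2 from (9.2 a-c).
    r = ((gamma + 1.0) * P + (gamma - 1.0)) / ((gamma - 1.0) * P + (gamma + 1.0))
    # Inflow speed from momentum (9.2 a) and mass (9.2 c) conservation.
    v1 = math.sqrt(0.5 * (p1 / rho1) * ((gamma + 1.0) * P + (gamma - 1.0)))
    return r * rho1, P * p1, v1, v1 / r


# Example: a pressure ratio of 4.5 in a gas with gamma = 1.4.
rho2, p2, v1, v2 = classical_shock(1.0, 1.0, 4.5)
```

For these inputs the density ratio is 8/3 and the inflow speed is twice the upstream sound speed. No field strength enters, in line with the remark that the equations apply to both insulating and conducting fluids.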


9.1.2. Perpendicular Field 

Alternatively, the magnetic field may be perpendicular to the propaga- 
tion direction, and so also perpendicular to the material motion. The 
magnetic-hydrodynamic coupling will now be important. By analogy 
with the earlier discussion it is to be expected that the field effect will in 
the non-relativistic limit augment the hydrostatic pressure and the 
internal energy by amounts depending on the square of the field strength. 
Detailed calculation shows this to be the case. For propagation along the 
x-axis of a Cartesian frame, and magnetic field along the y-axis, Hoffmann 
and Teller give the equations 

Further, 

so that there is no possibility of cancellation of field terms in (9.3). 
In the non-relativistic limit eqns. (9.3) become : 


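$$
\begin{aligned}
\text{(a)}\quad & \rho_1 v_1^2 + p_1 + \frac{H_{y1}^2}{8\pi} = \rho_2 v_2^2 + p_2 + \frac{H_{y2}^2}{8\pi} \\
\text{(b)}\quad & \rho_1 v_1\left(U_1 + \frac{H_{y1}^2}{8\pi\rho_1}\right) + v_1\left(p_1 + \frac{H_{y1}^2}{8\pi}\right) + \tfrac{1}{2}\rho_1 v_1^3 \\
 & \qquad = \rho_2 v_2\left(U_2 + \frac{H_{y2}^2}{8\pi\rho_2}\right) + v_2\left(p_2 + \frac{H_{y2}^2}{8\pi}\right) + \tfrac{1}{2}\rho_2 v_2^3 \\
\text{(c)}\quad & \rho_1 v_1 = \rho_2 v_2
\end{aligned}
\tag{9.5}
$$

By defining the new variables $p^*$ and $U^*$ according to 

$$ p^* = p + \frac{H_y^2}{8\pi} , \qquad U^* = U + \frac{H_y^2}{8\pi\rho} , \tag{9.6} $$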


the eqns. (9.5) become identical with the conventional Rankine-Hugoniot 
eqns. (9.2). These equations now take account of the magnetic field 
contributions to the pressure (through the magnetic pressure $H_y^2/8\pi$) 
and to the internal energy (through the magnetic energy $H_y^2/8\pi\rho$). 
The eqns. (9.5) become determinate if the strength of the shock and the 
strength of the magnetic field are given. The magnetic fields on each side 
of the shock are related according to 


$$ H_{y1} v_1 = H_{y2} v_2 . $$


A solution of (9.5) has the familiar form 

                                                                 (9.7) 

The shock disturbance described by (9.5) and (9.7) approaches the form 
of a longitudinal sound wave in the limit of small disturbances. In this case 
each side of eqns. (9.5) differs from the other by a small quantity only. 
Consequently the equations can be expressed in differential form under 
the condition that $dv = v_2 - v_1$ is infinitesimal. Equations (9.5) are in this 
case equivalent to the two equations 


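$$
\begin{aligned}
\text{(a)}\quad & \rho_0 v\,dv + \left(\frac{dp}{d\rho} + \frac{H_y^2}{4\pi\rho_0}\right) d\rho = 0 \\
\text{(b)}\quad & \rho_0\,dv + v\,d\rho = 0
\end{aligned}
\tag{9.8}
$$

$\rho_0$ is here the mean density, and $v$ the velocity of the small disturbance. 
Equation (9.8) may be solved for $v$ : 

$$ v = \left(\frac{dp}{d\rho} + \frac{H_y^2}{4\pi\rho_0}\right)^{1/2} . \tag{9.9} $$

As $H_y^2 \to 0$, $v \to (dp/d\rho)^{1/2}$ ; alternatively as $H_y \to \infty$, $v \to H_y/(4\pi\rho_0)^{1/2}$. 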
The first case corresponds to the normal sound wave; the second case 
refers to the retarded sound wave. This second case must be clearly 
distinguished from an Alfvén wave ; although it has formally the same 
velocity it is a longitudinal wave whereas the Alfvén wave is transverse. 
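
The two limits of the small-disturbance velocity can be checked numerically. The Python sketch below evaluates $v = (dp/d\rho + H_y^2/4\pi\rho_0)^{1/2}$ in Gaussian units; the function names and the sample values are illustrative assumptions only.

```python
import math


def longitudinal_wave_speed(dp_drho, H_y, rho0):
    """Speed of a small longitudinal disturbance in a perpendicular field.

    dp_drho is the derivative dp/d(rho) (the square of the ordinary sound
    speed), H_y the field strength and rho0 the mean density.
    """
    return math.sqrt(dp_drho + H_y ** 2 / (4.0 * math.pi * rho0))


def magnetic_limit_speed(H_y, rho0):
    """Strong-field limit H_y / (4 pi rho0)^(1/2) of the same expression."""
    return H_y / math.sqrt(4.0 * math.pi * rho0)


# Weak-field limit: the ordinary sound speed is recovered.
v_sound = longitudinal_wave_speed(2.0, 0.0, 1.0)
# Strong-field limit: the magnetic term dominates.
v_strong = longitudinal_wave_speed(2.0, 1.0e4, 1.0)
```

The second value agrees with `magnetic_limit_speed(1.0e4, 1.0)` to a very small fraction. The formula alone does not tell the two wave types apart: the disturbance here remains longitudinal, whereas the Alfvén wave with the same formal velocity is transverse.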

In situations of both astrophysical and laboratory interest it is sufficient 
to treat the fluid as an ideal gas ; the treatment of such a gas also involves 
only simple mathematics. For a relativistic ideal gas (for which $kT \gg m_0 c^2$ 
and $p^2 c^2 \gg m_0^2 c^4$) the pressure and energy density are known to be related 
by 

$$ \epsilon c^2 = 3p . $$

In the limit of a small disturbance the eqns. (9.3) can be replaced by the 
pair (Hoffmann and Teller 1950) : 

From (9.10) and (9.11) it follows that $v$, $\epsilon$ and $H_y$ are related according to 

For the case $H_y = 0$, $v = c/\sqrt{3}$. For the case $H_y \to \infty$ it follows from (9.12) 
that $v \to c$, i.e. as the energy density of matter becomes vanishingly small, 
the shock propagation velocity approaches the velocity of light. Hoffmann 
and Teller have shown that these results satisfy the requirements of 
stability and positive entropy change. 

In the non-relativistic case, where $U = (p/\rho)(\gamma - 1)^{-1}$, Helfer (1953) has 
shown that the magnetohydrodynamic field has properties analogous to 
those met with in conventional hydrodynamics, provided the definition 
of variables (9.6) is used. Of particular interest is the effect of the magnetic 
field on the shock strength. The physical condition $H > 0$ implies the 
inequality (Helfer 1953) 


Picmeated < [ae | 
pi PLAY ae) 
which gives the conventional form if the variable $p^*$ is introduced instead 
of $p$. It follows that the effect of the introduction of a magnetic field is 
to reduce the shock strength, and also to reduce the associated temperature. 
As Hoffmann and Teller pointed out, the effect of the field is not to change 
the mechanism of energy dissipation ; this still remains the conversion of 
kinetic energy into heat. 
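
The reduction of shock strength by a perpendicular field can be illustrated numerically. The Python sketch below assumes the non-relativistic conservation relations for this case in their standard form (momentum flux including the magnetic pressure $H^2/8\pi$, energy flux including the magnetic energy and pressure work, mass flux, and $H_1 v_1 = H_2 v_2$), together with the ideal-gas form of $U$ quoted above. The quadratic for the density ratio $r = \rho_2/\rho_1$ is obtained here by direct elimination and is not quoted from Helfer or from Hoffmann and Teller; the function name, the restriction $\gamma \le 2$ and the sample values are likewise assumptions of the sketch.

```python
import math


def perpendicular_shock(rho1, p1, v1, H1, gamma=5.0 / 3.0):
    """Jump across a longitudinal shock with a perpendicular field.

    Gaussian units; ideal gas with U = p / (rho * (gamma - 1)) and
    gamma <= 2 assumed.  v1 is the inflow speed in the shock frame.
    Returns (r, rho2, p2, v2, H2) with r = rho2 / rho1.
    """
    M2 = rho1 * v1 ** 2 / (gamma * p1)   # squared sonic Mach number
    b = H1 ** 2 / (8.0 * math.pi * p1)   # magnetic / gas pressure ahead
    # Eliminating p2, v2 and H2 from the conservation relations leaves,
    # besides the trivial root r = 1, the quadratic A r^2 + E r - D = 0.
    A = b * (2.0 - gamma) / gamma
    E = 1.0 + b + 0.5 * (gamma - 1.0) * M2
    D = 0.5 * (gamma + 1.0) * M2
    # Positive root, written so that it also holds when A = 0 (no field).
    r = 2.0 * D / (E + math.sqrt(E * E + 4.0 * A * D))
    rho2, v2, H2 = r * rho1, v1 / r, r * H1
    # Gas pressure behind the shock from momentum conservation.
    p2 = p1 + rho1 * v1 ** 2 * (1.0 - 1.0 / r) + (H1 ** 2 - H2 ** 2) / (8.0 * math.pi)
    return r, rho2, p2, v2, H2


# Same inflow (three times the sound speed) without and with a field.
gamma = 5.0 / 3.0
v1 = 3.0 * math.sqrt(gamma)
r_no_field = perpendicular_shock(1.0, 1.0, v1, 0.0, gamma)[0]
r_with_field = perpendicular_shock(1.0, 1.0, v1, math.sqrt(4.0 * math.pi), gamma)[0]
```

With no field the density ratio is 3; with magnetic pressure equal to half the gas pressure ahead of the shock it falls to roughly 2.5 for the same inflow speed. A root $r < 1$ signals that the inflow is too slow for a compressive shock to exist.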


9.2. Oblique Shocks 


A transverse shock is one in which the material motion is in the plane 
of the physical discontinuity. In the most general case, material will 
have velocity components both parallel and perpendicular to this plane ; 
such a general disturbance is called an oblique shock. It is mathematically 
more convenient to treat this general case than the pure transverse 
conditions, since in the latter case there is difficulty in a simple choice 
of the two reference frames. The treatment of oblique shocks provides 
many very difficult problems, made the more difficult because the corre- 
sponding no-field case does not arise. The central difficulty arises from 
a lack of uniqueness associated with the ability of the shock to convert 
magnetic energy into either kinetic energy or potential energy. The way 
in which the fluid leaves the shock front is therefore not unambiguous. 
Three parameters are now necessary to determine the problem, viz. the 
shock strength, the field strength, and angle between the lines of magnetic 
force and the shock front. 

By applying the well-known theory, Hoffmann and Teller derive the 
components of the energy momentum tensor appropriate to an oblique 
shock whose front is perpendicular to the x-axis of the S-frame, the 
magnetic field H lying entirely in the (xy) plane ($H^2 = H_x^2 + H_y^2$ ; 
$H_z = 0$). The S’-frame is derived from the S-frame by a velocity 
transformation parallel to the magnetic lines of force. Such a choice of axes 
introduces the simplification into the theory that all associated electric 
field components vanish. If $T_{\mu\nu}$ is the energy-momentum tensor 

these equations applying either behind or ahead of the shock. The 
equations of motion follow from the conservation conditions expressed 
by the vanishing of the four-divergence of the energy momentum tensor. 
The motion has been assumed not to extend to the (xz) plane, so that 
we require $T_{xz} = 0$ and $T_{yz} = 0$. The most natural way of satisfying 
this requirement is to assume both $v_z$ and $H_z$ to be zero. There is some 
ambiguity here, since such a condition is possible irrespective of the value 
of the (xy) component of stress, whether it be finite or zero. These two 
cases lead to somewhat different results. In each case the theory is 
extremely complex, especially so in its relativistic form. In consequence 
the non-relativistic limit alone has been developed in any serious way. 

In the non-relativistic limit Hoffmann and Teller find that the full 
eqns. (9.13) lead to the shock equations of the Rankine-Hugoniot form 

which applies on each side of the shock. A complete solution of these 
equations can be found if the shock strength, the magnetic field strength 
and the obliquity of the field are all known. Helfer (1953) has considered 
the various solutions of these equations in some detail for an ideal gas, and 
has given a number of graphs relating the ratio $H^2/p$ to the shock strength, 
for the initial obliquity angles 6°, 17½°, 45°, 72½° and 84°. He also 
considered these solutions in several limiting cases. When the magnetic 
field is small it appears that eqns. (9.14) lead to conventional hydrodynamic 
shock equations, as, indeed, must be the case. This suggests that in at 
least some astrophysical applications of the theory it is sufficient in general 
discussion to treat shock strength and propagation in terms of the 
conventional theory. One most important result of the theory in the 
present limit is that concerning the magnification of small magnetic fields 
(of strength such that $H^2/8\pi \ll p$) by a shock disturbance. This mechanism 
would seem to be of extreme astrophysical importance, in that it provides 
a natural amplification procedure for the small fields that arise 
spontaneously in turbulent media (Batchelor 1950, Schlüter and Biermann 
1950). This mechanism, however, seems more likely to be of value in 
explaining the suspected galactic background field of 10~° gauss than in 
the explanation of the origin of larger cosmic fields. Alternatively, when 
the magnetic field is high ($H^2/8\pi \sim 10^2 p$) the field magnification effect is 
not present. Now there is a proportionality between $H_1^2/p_1$ and $(p_2 - p_1)/p_1$ 
for the larger obliquity angles, and stronger shocks. The intermediate 
region of magnetic field strength arises from the weak field case by 
amplification ; as the shock strength is attenuated the magnetic field rises until 
it reaches an ‘ equilibrium ’ value such that $H^2 \sim 8\pi p$, further amplification 
not being possible. This relation would appear to have some astrophysical 
support. 
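
The weak-field magnification can be illustrated by a simple frozen-in-field estimate. This is a hedged sketch and not a solution of eqns. (9.14): it assumes the field is too weak to react on the flow and that the upstream flow is normal to the front, so that the field component normal to the front is continuous while the component lying in the front is compressed with the fluid. The function name and the sample values are hypothetical.

```python
import math


def amplified_field(H1, angle_deg, compression):
    """Weak-field estimate of the field strength behind a shock.

    angle_deg is the angle between the lines of magnetic force and the
    shock front; compression is the density ratio rho2 / rho1 supplied by
    the conventional (no-field) shock relations.
    """
    theta = math.radians(angle_deg)
    H_in_front = H1 * math.cos(theta)   # component lying in the front
    H_normal = H1 * math.sin(theta)     # component normal to the front
    # The normal component is unchanged; the in-front component is
    # multiplied by the compression.
    return math.hypot(H_normal, compression * H_in_front)


# A strong shock in a monatomic gas compresses by at most (gamma+1)/(gamma-1) = 4.
H2_parallel_to_front = amplified_field(1.0, 0.0, 4.0)
H2_normal_to_front = amplified_field(1.0, 90.0, 4.0)
```

On this estimate a field lying in the front is magnified by the full compression ratio, while a field normal to the front is not magnified at all; a single passage therefore multiplies the field by a modest factor only.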
In the limit of weak shocks, eqns. (9.14) can be put into a differential 
form expressing the small differences between conditions on each side of 
the disturbance 

These equations are quadratic in the velocity and so provide two possible 
velocities for a given physical situation applying to longitudinal or 
transverse waves. A further duality is introduced by the possibility 
of disturbance conditions being either those of parallel or perpendicular 
field. Helfer (1953) has considered these various solutions in detail, and 
has found that they correspond exactly with those given by van de Hulst 
explicitly for small, low velocity disturbances (cf. § 4). Previously, 
Hoffmann and Teller (1950) had given the propagation velocities for the 
two cases $H_x \to 0$ (perpendicular field) and $H_y \to 0$ (parallel field). In the 
first case these authors find a longitudinal velocity 

$$ V_L^2 = \frac{dp}{d\rho} + \frac{H_y^2}{4\pi\rho} $$

which shows the effect of the transverse field on sound propagation : the 
corresponding transverse velocity, $V_T$, is zero as in conventional 
hydrodynamics. 

In the second case, the perpendicular oblique shock, the longitudinal 
velocity is the unaffected sound velocity ; the square of the transverse velocity 
has the form $H_x^2/4\pi\rho$, and this velocity is to be identified as the Alfvén velocity. This treatment 
brings out clearly the distinction between the strictly Alfvén transverse 
wave and the modified longitudinal sound wave. Although formally 
similar in the magnetic velocity effect, they are to be clearly distinguished 
in applications of the theory (cf., for example, Anderson 1953). 


§ 10. SHOCK STRUCTURE 

In considering the propagation of shock waves it is permissible to treat 
the disturbance as being a plane of discontinuity, but in practice the shock 
shows structure. Conservation requirements in conjunction with dissipative 
processes are not compatible with the existence of mathematical dis- 
continuities in the fluid. The parameters specifying the fluid flow may, 
however, change appreciably over extremely small distances, so that the 
corresponding space gradient may be extremely large although finite. 
It is of interest to consider shock profiles, particularly because such 
information may have practical importance. Very recently, Marshall 
(1955) has reported some extremely interesting work in this connection 
and has arrived at some rather unexpected results. Although the continuum 
theory is invoked in this work and not the full kinetic theory, he 
offers reasons to support the belief that the results are nevertheless reliable, 
certainly for both form and order of magnitude. Marshall restricts his 
arguments to a completely ionized gas carrying a plane shock of infinite 
extent, and in this way escapes the need to include inessential complications 
such as photochemical effects and boundary distortions. In the present 
section we outline briefly the main results derived by Marshall. His 
analysis is restricted to the case of a longitudinal shock. 

When there is no magnetic field present the shock profile in an ionized gas 
differs from that in an unionized gas. This change arises from the relative 
importance of electrons and ions in the transport of energy or momentum ; 
for a neutral gas both energy and momentum are transported by a single 
species whereas in an ionized gas the electrons predominate in the transport 
of energy and the ions predominate in the transport of momentum. 
Consequently the length over which temperature changes in the shock 
differs from that over which the velocity changes. Further, the natural 
unit for the problem of an ionized gas appears from the theory to be greater 
than that appropriate to a neutral gas by a factor of about ten, the factor 
being proportional to the thermal conduction in the stationary gas. 

The shock profile depends upon the shock strength. For small shock 
strengths the profile is smooth and extends over a distance of about 
one hundred mean free paths; the temperature changes smoothly over 
essentially the same region as the velocity. For strong shocks the profile 
is distinctly different. First the region of transition appears to be about 
twice as long as in the other limit, the velocity and temperature changes 
occurring again in the same region. Whereas the temperature profile 
shows a steady increase, the velocity profile shows a separation of ions. 
At first it decreases slowly to about half strength, but about three-quarters 
of the way across it decreases sharply to its final value. The transition 
of the form of the profile from one extreme to another would appear to 
take place without discontinuity, although the shock strength 


$$ \eta = (3\gamma - 1)/(\gamma + 1) $$


has claims to be considered the critical strength. 

The introduction of a transverse magnetic field considerably complicates 
the theory, and Marshall has considered only the two cases, small, or 
alternatively high, electrical conductivity. As the electrical conductivity 
decreases to zero, the solution must approach that of the no-field case 
already outlined. As the electrical conductivity increases the hydro- 
dynamic—magnetic coupling will increase and new effects can be expected. 
The physical situation depends now on both the shock strength and 
magnetic field strength. For high electrical conductivity the field strength 
is important behind the shock. If this is small the shock profile is essentially 
the same as in the no-field case. Alternatively, if the field is large two 
regions of the shock profile can be distinguished. In the first region, 
which extends over some twenty or thirty mean free paths, there is a 
slow decrease of fluid velocity, accompanied by a slow increase of temperature 
and field. The second, adjoining region is characterized by a sharp 
decrease of velocity and simultaneous increase of field, and almost constant 
temperature. This profile is likely to be of significance in astrophysical 
problems, in regions of low pressure. 

If the electrical conductivity is low Marshall suggests that a rather 
unexpected effect appears. Three distinct shock profiles emerge from the 
theory, the distinguishing parameters being shock strength and initial field 
strength. First, for extremely strong shocks ($\eta > 3$) and for any field 
strength from essentially zero upwards, the shock profile shows two 
regions ; at first there appears a rather wide region in which the field, 
the fluid velocity and the temperature each change fairly slowly. The 
width of this region seems to be proportional to the mean free path in 
the undisturbed gas and inversely proportional to both the electrical 
conductivity and the kinematic viscosity. Behind this is a second rather 
extended region in which the magnetic field changes very little whilst the 
velocity and temperature change rapidly ; the width of this region is 
controlled by the thermal conduction. And finally, there is a narrow 
region of the order of a mean free path of rapidly changing velocity and 
virtually constant temperature and magnetic field. Secondly, for a 
certain range of shock strengths which satisfy the condition 


V Ply <Uug< Ras 


applying behind the shock, a profile of slightly different form emerges. 
Two regions can be recognized being essentially the first two regions of 
the strong shock case. The third region of length of the order of the mean 
free path which contains only slowly varying variables is not present ; 
the total disturbance has an overall length essentially the same as that of 
the strong shock, i.e. several thousand times the mean free path in the 
undisturbed gas. The most interesting feature of both these types of 
shock profile is the change of magnetic field ahead of the region of sharp 
velocity change. Marshall (1955 b) has quoted experimental support of 
Atkinson, Fowler and Holden (1954) for this feature of the magnetic field, 
although he admits that no quantitative test of his theory seems yet 
possible on this basis. Instead of the magnetic field adjusting itself behind 
the shock, as might have been expected, it appears to perform much of 
its change ahead, and so control the shock in this way ; the temperature 
changes ahead of the shock, with the field, although its major change 
occurs still in the velocity shock region. Shock profiles of this type occur 
for all shock strengths above a critical strength which is fixed by the 
field strength through the ratio $H_1^2/p_1$, up to shock strengths greater than 
about 2.6. Above this value the field loses distinguishing control. 

For the remaining regions of the ($H_1^2/p_1$, $\eta$) plane a third type of shock 
profile would appear to arise. This is essentially a small shock disturbance 
($\eta < 2.6$) and extends again over a range of a few thousand mean free paths. 
Velocity, temperature and field strength change smoothly throughout 
the shock, no sharp change of variables now occurring. 

So far only longitudinal shocks in the presence of a transverse magnetic 
field have been treated. It would be most interesting to have the theory 
developed to include oblique (i.e. transverse) shocks ; it can be expected 
that the profiles and features would be found to be essentially the same as 
those of longitudinal shock theory. Some especial astrophysical importance 
might be associated with this case, however. As a transverse shock moves 
out from the main body of an ionized medium in the presence of a magnetic 
field it moves continually into regions of decreasing density. Consequently, 
the ratio of magnetic to material energy density steadily increases. The 
velocity steadily approaches that of light, and on the boundary there is 
a transverse disturbance moving with a velocity indistinguishable from 
that of an electromagnetic disturbance, which also has a transverse 
character. Under some conditions it may be that the disturbance leaves 
the gas (i.e. energy is radiated) having an associated disturbance field that 
is of electromagnetic character. The frequencies and wavelengths of the 
constituent harmonic waves will follow from a Fourier analysis of the 
shock profile. It is suggestive to ask if such components exist which fall 
in the radio spectrum and if the power of the radiation would be of 
sufficient strength to account for at least some of the emissions observed 
in radio astronomy. 


§ 11. CONCLUDING REMARKS 


It is seen that a considerable development in the theoretical aspects 
of magnetohydrodynamics has taken place in the last few years, but a 
full experimental test of the theory is still lacking ; this is so much the 
case that the theory has had to be developed very largely without experi- 
mental authority. Certainly the accumulation of reliable and detailed 
laboratory data presents an extreme challenge to the experimentalist, 
and is a great need at the present time. Our knowledge of astrophysical 
processes, although growing, is still not sufficient to allow the inherently 
favourable cosmical conditions to take the place of the laboratory as 
a source of material for guidance in the further development of the theory. 

The presence of a magnetic field allows a conducting fluid to more 
readily withstand shear, this faculty vanishing with the field. It is not 
unreasonable to regard hydrodynamics as a special case of magneto- 
hydrodynamics. This would imply the existence of theorems which are 
recognizably generalizations of those well known in ordinary hydro- 
dynamics. These latter theorems cannot be taken over directly to the 
more general case since they are intimately connected with two-dimensional 
space, while the presence of the field necessitates a full three-dimensional 
treatment. So far general magnetohydrodynamic theorems analogous to 
those of hydrodynamics are not known, and their discovery, if indeed they 
exist, would mark an important advance. Perhaps this might come about 
when more experimental data ultimately become available. Meanwhile 
the theory would appear to have an important role to play in fields as 
widely separated as those of industry and astrophysics. 

Note added in proof—A similar suggestion has independently been made 
by H. K. Sen (1956, Phys. Rev., 102, 5). 

